(57) Abstract: The invention provides methods and compositions for engineering cells to generate large amounts of hydrogen.
Genes that are involved in hydrogen production pathways and genes that are upregulated when cells are exposed to conditions
conducive to the generation of hydrogen are mutagenized according to disclosed protocols. Microbes containing nucleic acid constructs
are screened or selected for the ability to generate an increased amount of hydrogen. Methods of producing hydrogen are also
disclosed.

Methods and compositions for evolving microbial hydrogen production

[001] This application claims priority to U.S. Patent Application No.: 10/287,750, filed November 4, 2002. This 
application also claims priority to U.S. Patent Application No.: 10/411,910, filed April 12, 2003. This application 
also claims priority to U.S. Patent Application No.: 60/500,032, filed September 3, 2003. U.S. Patent Applications 
10/287,750, 10/411,910, and 60/500,032 are hereby fully incorporated by reference for all purposes.

BACKGROUND OF THE INVENTION 

[002] Hydrogen is the most abundant element in the universe. When hydrogen is burned as a fuel, the only byproducts
are heat and water. Large-scale commercial production of hydrogen could have a massive impact on the world 
environment and economy. The availability of an environmentally clean, renewable energy source would greatly 
curtail if not end large-scale dependence on fossil fuels. Hydrogen can be converted into electrical energy by 
utilizing fuel cells, but it would also be an ideal replacement for oil-based energy since it has a caloric value per unit
weight 3 to 4 times that of petroleum (United States Patent 4,532,210).

[003] Fuel cell technology is being developed at a rapid pace; however, a plentiful and commercially viable source
of hydrogen with which to run fuel cells has not yet been created. There are a variety of known methods for 
producing hydrogen. For instance, inorganic membrane electrolysis technology (IMET) involves the splitting of 
water through electrolysis in the reaction 2H₂O → 2H₂ + O₂. Water electrolysis occurs through passing an electric
current through water to separate it into hydrogen and oxygen. Hydrogen gas is produced at the negative cathode 
and oxygen gas is produced at the positive anode. Another source of hydrogen production is through reforming 
natural gas. Unfortunately, this process produces carbon dioxide, making this source of hydrogen less than ideal.
[004] Hydrogen production through electrolysis, powered by renewable sources such as wind, solar energy through 
photovoltaic cells, or hydroelectric power has the advantage of not creating pollutants in the process of generating 
hydrogen; however, the potential amount of hydrogen that can be produced through these methods may be limited.
[005] What is needed are methods for engineering microbial organisms to produce hydrogen for extended periods 
of time in large amounts, something no known microbe is currently capable of doing. Furthermore, methods of 
identifying genes that are involved in hydrogen production pathways of microbes so that they can be optimized for 
efficient contribution to the production of hydrogen are needed. 

BRIEF SUMMARY OF THE INVENTION 
[006] Provided are methods for engineering a cell to produce an increased amount of hydrogen comprising providing
a mutagenized nucleic acid sequence derived from a first gene that encodes a protein involved in a hydrogen 
production pathway, transforming a cell with the mutagenized nucleic acid sequence, and screening or selecting the 
cell for an increased amount of hydrogen. 

[007] Methods are provided for identifying a first independent transformant which produces an increased amount 
of hydrogen, recovering the mutagenized nucleic acid sequence from the independent transformant, further
mutagenizing the recovered mutagenized nucleic acid sequence to create a new library of mutagenized nucleic acid 
sequences, transforming cells with the new library of mutagenized nucleic acid sequences, and screening or 
selecting for a new independent transformant that generates an increased amount of hydrogen compared to the first 
independent transformant. 
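
The recursive cycle described above (mutagenize, transform, screen, recover, mutagenize again) can be sketched as a short simulation. This is only an illustrative sketch under stated assumptions: the function names, the point-substitution model, and `toy_assay` (which merely counts G and C bases) are hypothetical stand-ins and do not represent the disclosed mutagenesis protocols or any real hydrogen measurement.

```python
import random

BASES = "ACGT"


def mutagenize(seq, n_variants, rate, rng):
    """Build a library in which each position is substituted with probability `rate`."""
    library = []
    for _ in range(n_variants):
        variant = [
            rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
            for base in seq
        ]
        library.append("".join(variant))
    return library


def evolve(start_seq, assay, rounds=3, n_variants=50, rate=0.05, seed=0):
    """Repeat mutagenize -> screen -> recover; keep the parent if no variant improves."""
    rng = random.Random(seed)
    best_seq, best_score = start_seq, assay(start_seq)
    history = [best_score]  # best score before round 1 and after each round
    for _ in range(rounds):
        # The library for this round is built once from the best sequence so far.
        for variant in mutagenize(best_seq, n_variants, rate, rng):
            score = assay(variant)
            if score > best_score:
                best_seq, best_score = variant, score
        history.append(best_score)
    return best_seq, history


def toy_assay(seq):
    """Hypothetical stand-in for a hydrogen assay: counts G and C bases."""
    return sum(base in "GC" for base in seq)
```

Calling `evolve("ATATATATAT", toy_assay)` returns the best sequence found and the best score after each round; because the parent is retained whenever no variant improves on it, the recorded score never decreases from one round to the next.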

[008] In some methods a plurality of mutagenized nucleic acid sequences are recovered from a plurality of
independent transformants which produce an increased amount of hydrogen, wherein the plurality of mutagenized
nucleic acid sequences are subjected to gene reassembly to generate the new library.

[009] In one embodiment a plurality of mutagenized nucleic acid sequences are used to transform a population of
cells, followed by the screening or selecting.

In one embodiment the first gene is selected from the group of genes that encode ferredoxin, catalase, isoamylase, malate
dehydrogenase, 14-3-3 protein, enolase, aldolase, ribosomal protein S8, ribosomal protein L17, ribosomal protein
S18, ribosomal protein L37, ribosomal protein L12, ribosomal protein S15, iron-hydrogenase, nickel-iron
hydrogenase, and components of the photosystem I, photosystem II, light harvesting antenna and cytochrome b6-f
complexes.

[010] The methods provided include mutagenesis of iron hydrogenase proteins including mutagenesis of the
X₁X₂X₃FX₄X₅X₆GGVMEAAX₇R and ADX₈TIX₉EE segments. In some methods, cognate sequences of these
conserved segments of iron hydrogenases are substituted into a Chlamydomonas iron hydrogenase. In some
methods, gene reassembly methods are performed in which a Chlamydomonas iron hydrogenase is mutagenized by
incorporation of segments of iron hydrogenase proteins from other species. Preferred segments for inclusion in
gene reassembly include segments that form parts of the gas channel. In some
methods a higher molecular weight amino acid is substituted into a gas channel segment, such as a tryptophan for
the methionine in the C. reinhardtii TIMEE segment. In other gene reassembly methods the iron hydrogenase is
reassembled using methods that involve attaching sections of duplex DNA that have only one overhanging
nucleotide. In other methods oligonucleotides encoding gas channel segments are annealed to a scaffold nucleic
acid, where the oligonucleotides anneal to non-overlapping sites. Preferably, the mutagenesis of a hydrogenase
does not decrease the protein's ability to accept electrons from an electron donor. In some methods the
mutagenized nucleic acid is transcribed by a light-driven promoter.
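
As an illustration of how the two conserved segments might be located in a protein sequence, the following sketch treats each X position as a single arbitrary residue and expresses the segments as regular expressions. The function names and the short test peptide are hypothetical and are not taken from any SEQ ID; the sketch also shows the tryptophan-for-methionine substitution in a TIMEE segment.

```python
import re

# Each X position in the conserved segments is treated as any single residue.
SEGMENT_1 = re.compile(r"[A-Z]{3}F[A-Z]{3}GGVMEAA[A-Z]R")  # X1X2X3FX4X5X6GGVMEAAX7R
SEGMENT_2 = re.compile(r"AD[A-Z]TI[A-Z]EE")                # ADX8TIX9EE


def find_segments(protein):
    """Return (segment name, 0-based start, matched text) for every segment found."""
    hits = []
    for name, pattern in (("segment_1", SEGMENT_1), ("segment_2", SEGMENT_2)):
        for match in pattern.finditer(protein):
            hits.append((name, match.start(), match.group()))
    return hits


def substitute_timee(protein, replacement="W"):
    """Replace the methionine of the first TIMEE occurrence (default: tryptophan)."""
    index = protein.find("TIMEE")
    if index == -1:
        raise ValueError("no TIMEE segment found")
    position = index + 2  # the M within T-I-M-E-E
    return protein[:position] + replacement + protein[position + 1:]


# Hypothetical toy peptide used only to exercise the functions above.
TOY_PEPTIDE = "MSTAQLFKERGGVMEAAIRLPADMTIMEESK"
```

With the toy peptide, `find_segments(TOY_PEPTIDE)` reports one hit for each segment, and `substitute_timee(TOY_PEPTIDE)` yields a peptide of the same length in which TIMEE has become TIWEE.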

[011] Methods are provided herein for screening or selecting for a hydrogen production phenotype in the presence 
of oxygen at a concentration selected from the ranges comprising more than 0.5%, more than 5.0%, more than 10%, 
more than 15%, approximately 21%, more than 21%, more than 25%, more than 30% or more than 35% oxygen. In 
some methods the cells screened or selected are in liquid culture media. 

[012] Methods are provided for mating (a) at least one cell of a strain containing a mutagenized form of the first
gene, wherein the at least one cell is identified by the screening or selecting or wherein the at least one cell is
derived through mating from a cell identified by the screening or selecting; (b) to at least one cell of a distinct strain
containing a mutagenized form of the second gene, wherein the at least one cell is identified by the screening or
selecting, or wherein the at least one cell is derived through mating from a cell identified by the screening or
selecting; and (c) screening or selecting for a progeny cell that produces an increased amount of hydrogen compared
to any parent cell.

[013] A method of hydrogen production is disclosed, comprising placing a cell containing a mutagenized nucleic
acid sequence corresponding to a gene that is involved in a hydrogen production pathway into liquid culture media
or onto solid culture media, wherein the mutagenized nucleic acid sequence is operably linked to a transcriptional
promoter sequence; culturing said transformed cell under conditions sufficient to stimulate transcription of said
mutagenized nucleic acid sequence(s); and collecting an evolved gas. In some methods the culture media supplied
to the cells is photoautotrophic growth requiring media.

[014] Mating methods are provided. One method is a method of multiparental mating of microbes that mate in
response to a stimulus, comprising: (a) providing a cell from each of 3 or more strains of microbes capable of
mating to each other in culture medium; (b) providing the stimulus; (c) allowing cells to mate and produce progeny;
(d) allowing the progeny cells to achieve sexual reproduction capability; (e) providing the stimulus at least one
more time; and (f) screening or selecting the further progeny for a desired phenotype. In some methods
the microbes are green algae and the stimulus is the removal of nitrogen from the media and illumination by light
comprising a wavelength of light between about 0.42 and 0.52 micrometers.

In some methods the green algae are of the Chlamydomonas genus, optionally of a species selected from the group
comprising reinhardtii, eugametos, incerta, and moewusii. In other methods the stimulus is interruption of
exponential growth in continuous light with a reduction in light, followed by addition of light, wherein the reduction
in light occurs for a period selected from the group consisting of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more
than 12 hours. In other methods the microbes are of the Scenedesmus genus and the stimulus is the addition of
chromium to the culture media. In some methods the desired phenotype is hydrogen production. In still other
methods, nucleic acid exchange occurs between only two parental cells at a time during the mating process.

[015] The foregoing description of some preferred embodiments of the invention is not a limiting description of the 
invention, and many other embodiments of the invention are described herein. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[016] Figure 1 demonstrates the method of subjecting homologous genes cloned from different microbes capable of
producing hydrogen to DNase I digestion in preparation for DNA shuffling procedures.

[017] Figure 2 demonstrates the construction of a library of shuffled sequences. DNase I digested fragments are
annealed to chimeric oligonucleotides that contain sequences corresponding to the N and C terminal ends of the
coding regions of the shuffled genes as well as linker sequences referred to as "unique sequences" that are present at
both ends of each fragment after annealing and primerless PCR.

[018] Figure 3 demonstrates the denaturation, annealing, and primerless PCR of DNA fragments containing 
different elements of a DNA construct used to transform cells. Denatured fragments anneal through unique 
sequences to other fragments. The shuffled library of coding regions of shuffled differentially regulated genes is 
flanked by unique sequences that anneal to promoter and transcriptional terminator sequences. 

[019] Figure 4 depicts a map of the DNA constructs described in Example 1, with details demonstrating the
annealing points of each shuffled library to flanking nonshuffled segments during construction. 
[020] Figure 5 depicts a map of the DNA constructs described in Example 1. 

[021] Figure 6 depicts a detailed map of the DNA constructs described in Example 1, including the relative
positions of PCR primers and chimeric oligonucleotides. The map is not necessarily drawn to scale.

[022] Figure 7 depicts a detailed map of the DNA constructs described in Example 2, including the relative
positions of PCR primers and chimeric oligonucleotides. The map is not necessarily drawn to scale.

[023] Figure 8 depicts a screening system for use with liquid culture-containing multiwell plates. 

[024] Figure 9 depicts amino acid residues in and near the gas channel of the Clostridium pasteurianum iron
hydrogenase from the structure 1FEH in the Protein Data Bank. The amino acid positions from the Clostridium
pasteurianum iron hydrogenase are shown in italics, while the corresponding amino acid positions from a
Chlamydomonas reinhardtii iron hydrogenase are shown above in non-italicized font, both according to the
numbering from Figure 4 of Happe, Eur J Biochem (2002) Feb;269(3):1022-32.

[025] Figure 10 depicts the codon usage table of C. reinhardtii. Most preferred codons are shown underlined and in
bold-face type. Any cDNA sequence can be recoded for maximal expression in C. reinhardtii by replacing
non-preferred codons with most preferred codons. Codon usage tables for microbes can be found at
http://www.kazusa.or.jp/codon/.

[026] Figure 11 depicts the mating of two C. reinhardtii cells. Genetic alterations on cognate chromosomes that
each increase hydrogen production can cosegregate in a progeny cell through a recombination event. Such progeny 
can produce more hydrogen than parental strains. 
[027] Figure 12 depicts multiparental mating of four strains of C. reinhardtii. Each of the four strains has a genetic
alteration that increases hydrogen production. The multiparental mating reaction proceeds through at least two 
cycles of nitrogen deprivation and germination. All four genetic alterations can cosegregate in a progeny cell. Such 
progeny can produce more hydrogen than either parent strain in any of the matings that occur in the multiparental 
mating reaction. 
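
The reason at least two mating cycles are needed to combine four alterations can be illustrated with a small enumeration. The sketch below rests on simplifying assumptions that are not part of the disclosure: each alteration sits at an independently assorting locus, cells are haploid, every pair of cells can mate (mating types are ignored), and only two parents exchange nucleic acid at a time. The names `progeny` and `mating_round` are hypothetical.

```python
from itertools import combinations


def progeny(parent_a, parent_b):
    """All genotypes obtainable from one pairwise mating under free assortment.

    A genotype is a frozenset of alteration labels. Alterations carried by both
    parents are always inherited; those carried by one parent may or may not be.
    """
    shared = parent_a & parent_b
    variable = sorted((parent_a | parent_b) - shared)
    genotypes = set()
    for size in range(len(variable) + 1):
        for extra in combinations(variable, size):
            genotypes.add(shared | frozenset(extra))
    return genotypes


def mating_round(population):
    """Mate every pair of genotypes in the population and pool the progeny."""
    offspring = set()
    for parent_a, parent_b in combinations(list(population), 2):
        offspring |= progeny(parent_a, parent_b)
    return offspring


# Four parental strains, each carrying one distinct alteration.
parents = {frozenset("a"), frozenset("b"), frozenset("c"), frozenset("d")}
first_cycle = mating_round(parents)
second_cycle = mating_round(first_cycle)
```

Under these assumptions, no progeny of the first cycle can carry more than two alterations, while the second cycle can produce a cell carrying all four, consistent with the requirement for at least two cycles of mating.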

[028] Figures 13-14 depict a gene reassembly protocol for incorporating segments of diverse iron
hydrogenases into the overall framework of a single iron hydrogenase. In this example, a C.
reinhardtii iron hydrogenase gene provides the single stranded framework. The design of the protocol
allows framework/hinge regions to be retained while architecture of the gas channel is altered compared to the C.
reinhardtii iron hydrogenase.

[029] Figure 15 shows the key to the identity of the amino acids of step 1 of figure 13 and the corresponding
identity of codons in nucleic acids in steps 2-9 of figures 13-14. 

[030] Figure 16 shows the divergent sequences from SEQ ID Nos: 1-112 that correspond to the segments of iron
hydrogenases that line the gas channel. These are the segments that are schematically depicted in figure
13, step 1. The sequences are used to design the oligonucleotides in step 2 of figure 13.

[031] Figure 17 shows one example of how gas channel segments from SEQ ID Nos: 1-112 are reverse translated
into recoded nucleotide sequence. C. reinhardtii flanking sequence is added to each side of the oligonucleotide
sequence to ensure adequate annealing. Although step 1 of figure 13 depicts 3 segments, while figure 16 shows
only 2 segments, the X₁X₂X₃FX₄X₅X₆GGVMEAAX₇R segment is broken into two distinct segments to allow
greater combinatorial diversity of the library, as this figure shows.
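
Reverse translation of a gas channel segment into a recoded oligonucleotide with flanking sequence might be sketched as follows. The function name, the flank strings, and the small codon table are hypothetical: the table lists one valid standard-code codon per amino acid purely as a placeholder and is not taken from Figure 10 or from any codon usage database.

```python
def reverse_translate(peptide, preferred_codon, left_flank="", right_flank=""):
    """Reverse translate a peptide using one codon per amino acid, then add flanks.

    `preferred_codon` maps one-letter amino acid codes to the codon to use;
    the caller supplies it (for example, from a codon usage table).
    """
    try:
        coding = "".join(preferred_codon[residue] for residue in peptide)
    except KeyError as exc:
        raise ValueError(
            f"no preferred codon supplied for amino acid {exc.args[0]!r}"
        ) from None
    return left_flank + coding + right_flank


# Placeholder table: valid standard-code codons, NOT measured C. reinhardtii preferences.
PLACEHOLDER_CODONS = {
    "A": "GCC", "D": "GAC", "E": "GAG", "I": "ATC",
    "M": "ATG", "T": "ACC", "W": "TGG",
}
```

For instance, `reverse_translate("TIWEE", PLACEHOLDER_CODONS, left_flank="AAA", right_flank="TTT")` builds an oligonucleotide whose central 15 nucleotides encode the segment; in practice the flanks would be copied from the framework gene on either side of the segment so that the oligonucleotide anneals at the intended site.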


DETAILED DESCRIPTION OF THE INVENTION 

[032] All publications, patents, patent applications, and other references cited are fully incorporated by reference 
for all purposes. 


[033] Definitions: The following definitions are intended to convey the intended meaning of terms used
throughout the specification and claims; however, they are not limiting in the sense that minor or trivial differences
fall within their scope.

[034] "Differential expression profile" means information about the activity of at least one gene or the presence or 
activity of at least one protein in a cell when the cell is exposed to at least two different environmental conditions or
chemical environments. Literally any difference in the conditions that the cell might be exposed to can cause a 
difference in the expression of one or more genes or the presence or activity of one or more proteins. 
[035] "Conditions more conducive to the generation of hydrogen" means any set of conditions under which a cell 
generates hydrogen. 

[036] "Conditions more conducive to the generation of hydrogen" also means, in an experiment intended to
generate a differential expression profile, conditions under which a cell that already generates a measurable amount
of hydrogen under a first set of conditions generates, under a second set of conditions distinct from the first set, a
measurably greater amount of hydrogen than it does under the first set of conditions.

[037] "Conditions less conducive to the generation of hydrogen" means any set of conditions under which a cell
either generates no measurable amount of hydrogen or generates measurably less hydrogen than under conditions
more conducive to the generation of hydrogen. Specifically, conditions more conducive to the generation of
hydrogen cause a cell to generate a measurable amount of hydrogen while conditions less conducive to the
generation of hydrogen cause a cell to generate either no hydrogen or measurably less hydrogen than the conditions
more conducive to the generation of hydrogen in that same experiment. When cells are cultured under conditions
less conducive to the generation of hydrogen yet produce a measurable amount of hydrogen, that measurable
amount of hydrogen is less than the amount of hydrogen produced by cells cultured under conditions more
conducive to the generation of hydrogen in order to produce a differential expression profile. In terms of measuring
the amount of hydrogen produced, a greater amount of hydrogen produced by a cell under one condition compared
to another condition is determined by measuring production of hydrogen over a given time interval.

[038] "Conditions not conducive to the generation of hydrogen" means any set of conditions under which a cell
does not generate a measurable amount of hydrogen.

[039] "Culture conditions" and "conditions" mean the plurality of variables that are manipulated when culturing
microbes, including but not limited to exposure to light or certain wavelengths of light, exposure to certain
molecules, nutrients, elements, and the like in culture media as well as exposure to different concentrations of these
molecules, elements, nutrients, and the like, temperature, placement in darkness or partial darkness, exposure to
other microbes or viruses, as well as any other variable that is manipulated when culturing microbes.

[040] "Differentially regulated" means that the activity of a gene or a protein in a cell is in some way different
under one set of culture conditions than under a different set of culture conditions. For instance, Chlamydomonas
cells express certain genes in higher amounts during the first hour of anaerobic culturing in the dark as compared to
culturing in the presence of oxygen and illumination. Even though certain genes are expressed in both culture
conditions, if the genes are expressed at different levels between the two conditions they are differentially regulated.
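
A differential expression profile of this kind can be sketched as a comparison of per-gene expression levels between the two culture conditions. In the sketch below the function name, the gene labels, and the two-fold cutoff `min_fold` are assumptions made for illustration; the definition above treats any difference in expression level as differential regulation, so the cutoff is only a practical convenience for noisy measurements.

```python
def classify_genes(less_conducive, more_conducive, min_fold=2.0):
    """Label each gene measured under both conditions by its change in expression.

    Both arguments map gene labels to non-negative expression levels. A gene is
    called upregulated (or downregulated) when its level under the more conducive
    condition is at least `min_fold` times higher (or lower) than under the
    less conducive condition.
    """
    calls = {}
    for gene in sorted(set(less_conducive) & set(more_conducive)):
        low, high = less_conducive[gene], more_conducive[gene]
        if high > low and high >= min_fold * low:
            calls[gene] = "upregulated"
        elif low > high and low >= min_fold * high:
            calls[gene] = "downregulated"
        else:
            calls[gene] = "unchanged"
    return calls


# Hypothetical expression levels (arbitrary units) for three placeholder genes.
aerobic_light = {"gene_A": 1.0, "gene_B": 10.0, "gene_C": 5.0}
anaerobic_dark = {"gene_A": 8.0, "gene_B": 2.0, "gene_C": 5.5}
profile = classify_genes(aerobic_light, anaerobic_dark)
```

Genes labeled upregulated under the more conducive condition are the natural candidates for the mutagenesis described elsewhere in the specification.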
[041] "Mutagenized nucleic acid sequence" means a nucleic acid sequence in which the nucleotide sequence of the
mutagenized nucleic acid sequence differs from a starting sequence prior to mutagenesis by at least one base pair.
For instance, a single nucleic acid sequence is amplified using error-prone PCR to generate a library of nucleic acid
sequences that are similar in sequence to the starting sequence but differ by at least one base pair, and are therefore
mutagenized nucleic acid sequences. Alternatively, a plurality of nucleic acid sequences that have significant
sequence identity are put through a gene reassembly process to generate mutagenized nucleic acid sequences.
Mutagenized nucleic acid sequences are derived from the full or partial sequence of at least one wild type sequence,
also referred to as a starting sequence. In gene reassembly processes the starting sequences are the parental genes in
non-recombined form. Mutagenized nucleic acid sequences can also be generated by chemical mutagenesis of
living cells using carcinogens such as nitrosoguanidine (NTG).

[042] "Significant sequence identity" means at least 40%, preferably 50%, more preferably 60% and more
preferably 70%, and even more preferably 80% or 90% or higher nucleotide sequence identity when compared
using a standard sequence comparison such as the BLAST program available at www.ncbi.nlm.nih.gov.
Mutagenized nucleic acid sequences can also be generated using standard site-directed mutagenesis protocols
(Maniatis et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory).
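
The identity thresholds in the definition of significant sequence identity can be illustrated with a simple calculation on two sequences that have already been aligned. This is a minimal sketch and not a substitute for BLAST: it assumes the alignment is supplied by the caller with `-` marking gaps, and it ignores gapped columns, which is one of several conventions for reporting identity. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions among the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == "-" or base_b == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def has_significant_identity(aligned_a, aligned_b, threshold=40.0):
    """Apply the minimum 40% threshold from the definition (others may be passed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-nucleotide sequences that differ at a single position score 90% and meet every threshold listed in the definition, whereas sequences matching at only one position in five score 20% and do not qualify.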

[043] "Downregulated" means, when relating to a gene, when a gene is transcribed less per unit time or when a
gene's corresponding RNA is translated fewer times per unit time than it was when compared to the level of
transcription or translation previously. "Downregulated" means, when relating to a protein, when the protein's
activity per unit time is diminished when compared to the level of activity per unit time previously, when the
protein is degraded at a faster rate, or when the gene encoding the protein is transcribed less per unit time or is
translated fewer times per unit time than it was when compared to the level of transcription or translation previously.
[044] "Upregulated" means, when relating to a gene, when a gene is transcribed more per unit time or when a gene's
corresponding RNA is translated more times per unit time than it was when compared to the level of transcription or
translation previously. "Upregulated" means, when relating to a protein, when the protein's activity per unit time is increased
when compared to the level of activity per unit time previously, when a protein is degraded at a slower rate, or when
the gene encoding the protein is transcribed more per unit time or is translated more times per unit time than it was
when compared to the level of transcription or translation previously.

[045] "Shuffling" means recombining a first nucleic acid with at least one other nucleic acid distinct in sequence
from the first nucleic acid, wherein the first nucleic acid and the at least one other nucleic acid recombine through
sequence-specific annealing with each other or to a third nucleic acid. Shuffling is also referred to as gene
reassembly.

[046] "Site-directed mutagenesis" means generating a desired gene sequence that differs from the sequence of a 
starting gene, wherein the sequence difference is a specifically designed amino acid insertion, deletion, substitution, 
or combination thereof. 

[047] "Increased amount of hydrogen" means an amount of hydrogen produced by a strain that has been 
transformed with a mutagenized nucleic acid sequence that is greater than the amount of hydrogen produced by the
starting strain that has either not been transformed with the mutagenized nucleic acid sequence or that has been 
transformed using only control or vector sequences. 
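
Screening for an increased amount of hydrogen against this definition can be sketched as a comparison of production rates measured over the same time interval. The data layout (cumulative readings as `(hours, amount)` pairs), the function names, and the example numbers are hypothetical; units are arbitrary as long as the control and the transformants share them.

```python
def production_rate(readings):
    """Average hydrogen produced per hour between the first and last reading.

    `readings` is a time-ordered list of (hours, cumulative_amount) pairs.
    """
    (start_time, start_amount), (end_time, end_amount) = readings[0], readings[-1]
    if end_time <= start_time:
        raise ValueError("readings must span a positive time interval")
    return (end_amount - start_amount) / (end_time - start_time)


def improved_transformants(control_readings, transformant_readings):
    """Names of transformants whose rate exceeds that of the control strain."""
    baseline = production_rate(control_readings)
    return sorted(
        name
        for name, readings in transformant_readings.items()
        if production_rate(readings) > baseline
    )


# Hypothetical readings: control strain versus two independent transformants.
control = [(0, 0.0), (2, 4.0)]
transformants = {
    "transformant_1": [(0, 0.0), (2, 6.0)],
    "transformant_2": [(0, 0.0), (2, 3.0)],
}
hits = improved_transformants(control, transformants)
```

Here the control may be the untransformed starting strain or a strain transformed with vector sequences only, matching the two baselines named in the definition.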

[048] A cell "derived through mating" from a distinct cell is a cell that would not exist but for the mating of the
distinct cell with at least one other cell. For example, a distinct cell has a mutagenized nucleic acid sequence that
causes increased hydrogen production. The distinct cell is mated to another cell, resulting in progeny cells. The
progeny cells are derived through mating from the distinct cell.



DESCRIPTION 

Culturing bacteria under conditions more conducive to the generation of hydrogen

[049] Methods for culturing photosynthetic bacteria under conditions more conducive and less conducive to the
generation of hydrogen are known (Maness, (2001) Appl Microbiol Biotechnol Dec;57(5-6):751-6; Weaver PF,
Proceedings of the Fifth Joint US/USSR Conference of the Microbial Enzyme Reactions Project, Jurmala, Latvia,
USSR (1979) 461-479). Methods for culturing cyanobacteria under conditions more conducive and less conducive
to the generation of hydrogen are known (Masukawa, Appl Microbiol Biotechnol 2002 Apr;58(5):618-24;
Benneman JR, Proceedings of the 10th World Hydrogen Energy Conference, Cocoa Beach, FL, USA (1994);
Papen, Biochimie 1986 Jan;68(1):121-32). Methods for culturing other bacteria such as E. coli under conditions
more conducive and less conducive to the generation of hydrogen are known (Nandi, J Bacteriol 1985
Apr;162(1):353-60). The culture media may be solid or liquid.

[050] Standard growth media for other types of cells such as bacteria, cyanobacteria, and photosynthetic bacteria
are known (see Maniatis et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory;
Masukawa, Appl Microbiol Biotechnol 2002 Apr;58(5):618-24; Papen et al., Biochimie 1986 Jan;68(1):121-32;
and Dzelzkalns, J Bacteriol 1986 Mar;165(3):964-71). Preferably the cells are cultured in liquid media during a
screening or selection process since a desired strain that is capable of generating large amounts of hydrogen in the
presence of oxygen is commercially deployed in liquid media.


Culturing Green Algae under conditions less conducive to the generation of hydrogen 

[051] Green algae such as Chlamydomonas reinhardtii are grown in atmospheric conditions (i.e., normal air), with
or without illumination, according to standard protocols (Harris, (1989) The Chlamydomonas Sourcebook.
Academic Press, New York; Rochaix J-D et al. (1998) The Molecular Biology of Chloroplasts and Mitochondria in
Chlamydomonas (Advances in Photosynthesis, Vol 7)). A culture is grown for any period of time under these
conditions. Although it is desired to grow the cells overnight to obtain a healthy culture, if the starting cells were
also grown under any conditions less conducive to the generation of hydrogen the culture need not be grown for a
long period of time. All that is necessary is for the cells to be cultured for some amount of time, preferably at least
5 minutes under conditions less conducive to the generation of hydrogen, before harvesting. More preferably, the
cells are cultured for one or more hours before harvesting. Alternatively, cells are grown and then frozen. The
exact conditions and duration of culturing are not vitally important, and trivial differences can be incorporated into
the protocol, as long as the cells were not placed in conditions more conducive to the generation of hydrogen within
at least about 10 minutes before harvesting. For example, the cells are cultured in Sager's minimal media or TAP
media in light.


Culturing green algae under conditions more conducive to the generation of hydrogen 

[052] In one example, green algae such as C. reinhardtii are cultured under conditions in which no sulfur is present
in the media and atmospheric oxygen is not present in any gas space contacting the media. After about 15 hours
under such conditions, green algae cells begin producing hydrogen (Zhang, Planta (2002) Feb;214(4):552-61;
Melis, Plant Physiol (2000) Jan;122(1):127-36). In other methods, cells are provided minimal amounts of sulfur,
such as between 10 and 50 micromolar sulfur, and under such conditions cells generate hydrogen (Kosourov,
Biotechnol Bioeng 2002 Jun 30;78(7):731-40).

[053] Preferably the cells are cultured in liquid media during a screening or selection process since a desired strain
that is capable of generating large amounts of hydrogen in the presence of oxygen is commercially deployed in
liquid media. In other words, it is desirable to screen or select for cells in the same type of media as will be used for
commercial hydrogen production. For this reason liquid growth media is preferred. Growth media for
Chlamydomonas cells, such as Sager's Minimal Media and Hutner's Trace Element Media, are described in sources
such as Harris E., (1989) The Chlamydomonas Sourcebook. Academic Press, New York and Rochaix J-D et al.
(1998) The Molecular Biology of Chloroplasts and Mitochondria in Chlamydomonas (Advances in Photosynthesis,
Vol 7). These growth media can be made as solid agar or as liquid. Other green algae media can be used, such as
Tris-Acetate-Phosphate (TAP) media or Sueoka's media, as described in Harris and other sources. Minimal media
such as Sager's (also known as Sager-Granick) is preferred when the host organism is or can be photoautotrophic
because it is desirable to evolve microbes to generate hydrogen using only sunlight as energy. Sager's media is an
example of photoautotrophic growth requiring media.

[054] Any component of the culture media may be manipulated. For example, a selection molecule such as an
antibiotic is added to the culture media and a corresponding selectable marker gene is incorporated into the
transformation vector containing the recoded and recombined hydrogenase library.

[055] Optionally, other components of the culture media are manipulated such as amount of sulfur in the media.
The level of sulfur may be increased, decreased, or held constant throughout the period of culture (see Melis et al.,
Plant Physiol (2000) Jan;122(1):127-36 and Zhang et al., Planta (2002) Feb;214(4):552-61).
[056] Another component that may be optionally added to the culture media is metronidazole (MNZ). MNZ is a
strong oxidizer of reduced ferredoxin. Ferredoxin accepts electrons from the Photosystem I complex and transfers
them to the hydrogenase to supply electrons for the 2H⁺ + 2e⁻ → H₂ reaction. When MNZ is added to the culture
media a controlled amount of oxygen is also added to the culture container and cells that survive are assayed for
hydrogen production. In a typical experiment, C. reinhardtii cells that survive the MNZ treatment protocol (cultured,
for example, in Sager's minimal media in 20 mM MNZ; 1 mM sodium azide; 2% oxygen; 200 W/m² light for 20
minutes, with expression of one or more mutagenized nucleic acid sequences) are placed in liquid culture media in
multiwell plates and assayed for hydrogen production. It is unnecessary to count the number of independent
transformants that survive the MNZ treatment. Any transformant that survives the treatment is capable of
producing more hydrogen under a certain level of oxygen than a wild-type cell, and therefore all survivors are
assayed for hydrogen production without regard to the number or percent of mutant survivors. For an example of
the use of MNZ, see U.S. Patent 5,871,952.

[057] In one embodiment, cells are cultured in a Tris-acetate-phosphate media, at approximately pH 7.0 (Harris,
(1989) The Chlamydomonas Sourcebook. Academic Press, New York). The cultures are bubbled with 3% CO₂ in
air at 25°C. The cultures are continuously illuminated. After at least five minutes of culturing under these
conditions, cells are harvested and are resuspended in the same media as before except for the absence of sulfur.
The cells are then cultured under continuous illumination. Alternatively, the cells are originally cultured in the
absence of acetate, but under continuous illumination (i.e., photoautotrophically), and are then transferred to media
that lacks sulfur. Alternatively, culture conditions comprise culturing the cells in media that is
devoid of sulfur, iron, or manganese, or any combination of these three elements.

[058] In another embodiment, frozen aliquots of green algae are thawed in culture media devoid of sulfur and
continuously cultured, in the presence of light, for at least five minutes. The cells are then harvested.

[059] There are other culture conditions for some algae species that are conducive to the generation of hydrogen
besides the sulfur deprivation method. For instance, blue-green algae produce hydrogen when starved of nitrogen 
(Weissman, Appl Environ Microbiol 1977 Jan;33(1):123-31). Hydrogen is also generated when green algae are
cultured in the absence of light when the culture is flushed with gases, such as argon, that remove oxygen from the
media (Happe, Eur J Biochem (2002) Feb;269(3):1022-32).

Generation of a differential expression profile: comparison of RNA between cells cultured in conditions more 
conducive to the generation of hydrogen and cells cultured in conditions less conducive to the generation of 
hydrogen

[060] Once at least two sets of cells are cultured under conditions more conducive and less conducive to the 
generation of hydrogen, RNA samples are extracted from the cells. Methods and protocols for the isolation of RNA 
from bacterial and algae cells are well known in the art (Maniatis et al. (1989) Molecular Cloning : A Laboratory 
Manual Cold Spring Harbor Laboratory; Harris, (1989) The Chlamydomonas Sourcebook. Academic Press, New 
York; Rochaix J-D et al. (1998) The Molecular Biology of Chloroplasts and Mitochondria in Chlamydomonas
(Advances in Photosynthesis, Vol 7)).

[061] The RNA is isolated from both the cells placed under conditions more conducive to the generation of
hydrogen as well as cells placed under conditions less conducive to the generation of hydrogen. There is no 
requirement that both sets of cells be grown simultaneously or that RNA be isolated from both sets of cells 
simultaneously. There is also no requirement that the same strain of microbe be used in both culture conditions, 
although it is preferred that they be the same strain. 

[062] After RNA is isolated from the cells, a plurality of methods can be utilized to generate a differential 
expression profile. 

[063] In one embodiment, the RNA is placed on microarrays such as silicon chips or glass slides containing 
sequences corresponding to known sequences from the genome of the cells. It is not necessary that the sequences 
immobilized onto the microarray are derived from the same strain or species of the cells from which RNA is
isolated as long as the genome of the cells used to make the microarray is somewhat homologous to the genome of 
the cells from which the RNA is isolated. For instance, the cells exposed to conditions more conducive and less 
conducive to the generation of hydrogen are Chlamydomonas fusca while the sequences immobilized on the 
microarrays are Chlamydomonas reinhardtii. Utilizing evolutionarily related strains of microbes for purposes of 
RNA isolation and microarray sequence immobilization provides reliable data, and the methods disclosed herein are 
utilized with a variety of microbes. RNA molecules isolated from cells hybridize with nucleic acid molecules 
immobilized on the microarray to form double stranded RNA duplexes. Such duplexes are detected by a variety of 
methods known in the art (such as the GeneChip® product and associated scanning techniques produced by 
Affymetrix Inc., Santa Clara, CA; Dudley, Proc Natl Acad Sci U S A 2002 May 28;99(11):7554-9). In one
embodiment the RNA isolated from cells is amplified by PCR and labeled nucleotides are incorporated into the 
newly synthesized nucleic acid molecules. These molecules are digested with a nuclease, denatured to single 
stranded molecules, and hybridized to the immobilized sequences on the chip. Double stranded duplexes that form 
contain the labeled nucleotides from the PCR reaction in one strand, and these duplexes are visualized. For 
example, the label incorporated into the molecules in the PCR reaction is a fluorescent molecule, and the microarray 
is placed into a fluorescence detection chamber. Such microarray technology is well known in the art. For 
instance, microarrays containing over 2,700 unique genes from C. reinhardtii are commercially available 
(Chlamydomonas Genome Project, Duke University, Durham, N.C.). In addition to the ability to visualize whether 
or not a duplex has formed on a particular spot corresponding to a particular gene on the chip, this technology also 
quantitates the difference in the amount of duplex formed on a given spot between two or more experiments using 
different RNA samples. This differentiation ability allows the identification of differentially regulated genes 
between cells grown in culture conditions more conducive to the generation of hydrogen and less conducive to the 
generation of hydrogen. 

[064] Upon hybridization of the RNA samples from two or more sets of cells, genes that are upregulated or 
downregulated between the two sets of cells are identified. For example, the iron hydrogenase gene in 
Chlamydomonas is turned on when the cells are exposed to conditions more conducive to the generation of 
hydrogen; however, the gene is turned off when the cells are exposed to conditions not conducive to the generation
of hydrogen. When the two RNA samples are placed on microarrays containing immobilized sequences 
corresponding to the genome of C. reinhardtii, a spot on the chip containing the sequence of the iron hydrogenase 
gene contains a duplex of nucleic acid when the RNA sample is isolated from cells exposed to conditions more 
conducive to the generation of hydrogen, whereas the spot does not contain a duplex when the RNA sample is 
isolated from the cells exposed to conditions not conducive to the generation of hydrogen. The C. reinhardtii iron
hydrogenase gene is differentially regulated between cells exposed or not exposed to conditions more conducive to
the generation of hydrogen, and therefore the gene is identified as differentially regulated. 
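
As a rough illustration of how the comparison described above could be scored, the following Python sketch flags genes whose spot signal differs between the two RNA samples. The function name, the log2 threshold, the pseudocount, the gene labels geneX and geneY, and all intensity values are assumptions made for this example and are not taken from the disclosure; real microarray data would also require normalization and replicate measurements.

```python
import math

def differential_genes(more, less, min_log2_fold=1.0, pseudocount=1.0):
    """Return {gene: log2 ratio} for genes whose signal differs between samples.

    more / less map a gene name to the spot intensity measured with RNA from
    cells under conditions more / less conducive to hydrogen generation.
    The pseudocount avoids division by zero when a spot shows no duplex.
    """
    calls = {}
    for gene in more.keys() & less.keys():
        ratio = math.log2((more[gene] + pseudocount) / (less[gene] + pseudocount))
        if abs(ratio) >= min_log2_fold:
            calls[gene] = ratio
    return calls

# Made-up intensities for illustration only.
conducive = {"hydA": 1500.0, "geneX": 210.0, "geneY": 40.0}
less_conducive = {"hydA": 0.0, "geneX": 200.0, "geneY": 400.0}
calls = differential_genes(conducive, less_conducive)
```

In this toy data set the hydrogenase entry receives a positive ratio (upregulated), geneY a negative ratio (downregulated), and geneX falls below the threshold and is not reported.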



Generation of a differential expression profile: Suppression Subtractive Hybridization between cells cultured in 
conditions more conducive to the generation of hydrogen and cells cultured in conditions less conducive to the 
generation of hydrogen 

[065] In another embodiment, RNA is isolated from both sets of cells and is put through the Suppression 
Subtractive Hybridization PCR technique (Diatchenko, Proc Natl Acad Sci U S A 1996 Jun 11;93(12):6025-30;
Happe, Eur J Biochem (2002) Feb;269(3):1022-32; commercially available kits are provided by Clontech
Laboratories, Inc., Palo Alto, CA). In this technique transcripts from genes expressed in one sample (in this case 
the cells cultured under conditions more conducive to the generation of hydrogen) but not the other (in this case the 
cells cultured under conditions less or not conducive to the generation of hydrogen) are selectively amplified 
through the PCR method. Genes amplified through this technique are differentially regulated genes. 

Generation of a differential expression profile: Two Dimensional gel electrophoresis between cells cultured in 
conditions more conducive to the generation of hydrogen and cells cultured in conditions less conducive to the
generation of hydrogen 

[066] A differential expression profile is created by subjecting protein samples from both sets of cells to two 
dimensional gel electrophoresis. This technique is well known in the art, and is optionally coupled with mass 
spectrometry techniques to aid in the identification of proteins (Arthur, Kidney Int 2002 Oct;62(4):1314-21). Spots
indicating proteins on a gel from cells exposed to conditions more conducive to the generation of hydrogen but not 
present or present in different amounts on a gel from cells exposed to conditions less conducive to the generation of 
hydrogen correspond to proteins encoded by differentially regulated genes. Two dimensional gel electrophoresis 
analysis is advantageous for purposes such as monitoring the content of organelles such as the chloroplast or
multiprotein complexes such as photosystem I that are involved in the production of hydrogen (Dreger, Eur J
Biochem. 2003 Feb;270(4):589-99).

Generation of a differential expression profile: Other Methods

[067] In another embodiment, a differential expression profile is created by analyzing only a single gene or a small
set of genes through methods such as Northern blotting, Western blotting, or activity assays specific to a protein of 
interest (Maniatis et al. (1989) Molecular Cloning : A Laboratory Manual Cold Spring Harbor Laboratory). A 
plurality of methods, specific to each gene, is employed to assess a difference in the activity of a gene or protein 
between two or more samples of cells exposed to different conditions. Any difference in conditions that a cell is 
exposed to may cause differential activity of some genes and/or proteins, including but not limited to components of 
culture media, temperature, exposure to sunlight or light of varying wavelengths, the presence of specific nutrients 
or elements, exposure to certain molecules, and exposure to other organisms or viruses. 

Identification of differentially regulated genes 

[068] After generation of the differential expression profile, any gene or protein demonstrated to be differentially 
regulated when cells are exposed to conditions more conducive to the generation of hydrogen versus conditions less 
conducive to the generation of hydrogen is a target for engineering efforts. For instance, the iron hydrogenase gene
in C. reinhardtii is differentially regulated between conditions more conducive to the generation of hydrogen and
conditions less conducive to the generation of hydrogen. 

[069] Also provided are methods for the identification of genes and proteins downregulated when cells are exposed 
to conditions more conducive to the generation of hydrogen. Such genes are targets for mutation, deletion from the 
genome, or downregulation through methods such as RNA interference. Alternatively, molecules capable of
inhibiting the activity of proteins downregulated when cells are exposed to conditions more conducive to the 
generation of hydrogen are added to the culture in order to stimulate the cells to generate an increased amount of 
hydrogen. 

Providing mutagenized nucleic acid sequences corresponding to differentially regulated genes

[070] Clones of genes identified as differentially regulated are obtained. Creation of full-length cDNA molecules 
is standard in the art (Maniatis et al. (1989) Molecular Cloning : A Laboratory Manual Cold Spring Harbor 
Laboratory); however, gene fragments are also used. The gene or gene fragment is mutagenized using one or more
mutagenesis methods. 

[071] In one embodiment, the gene is amplified using error-prone PCR. Error-prone PCR is a standard procedure
in the art (Leung, Technique (1989) 1, 11-15). In this technique the gene of interest is amplified using a DNA
polymerase under conditions that reduce the fidelity of sequence replication. The result is that the
amplification products contain at least one error in the sequence. When a gene is amplified and the resulting
product(s) of the reaction contain one or more alterations in sequence when compared to the template molecule, the
resulting products are mutagenized as compared to the template.
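
The effect of low-fidelity amplification can be sketched in silico. The following Python example is only a toy model of error-prone PCR: it copies a template while substituting bases at a fixed per-base error rate and keeps only the copies that differ from the template. The function names, the 1% error rate, and the template sequence are illustrative assumptions; the model ignores insertions, deletions, and polymerase-specific mutation bias.

```python
import random

def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, replacing each base with a different base at error_rate."""
    bases = "ACGT"
    product = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the other three bases.
            product.append(rng.choice([b for b in bases if b != base]))
        else:
            product.append(base)
    return "".join(product)

def mutagenized_library(template, n_products, error_rate=0.01, seed=0):
    """Return the copies that contain one or more alterations relative to the template."""
    rng = random.Random(seed)
    copies = (error_prone_copy(template, error_rate, rng) for _ in range(n_products))
    return [copy for copy in copies if copy != template]

# Hypothetical 120-nucleotide template used only for demonstration.
template = "ATGGCC" * 20
library = mutagenized_library(template, n_products=200, error_rate=0.01, seed=1)
```

Every member of `library` has the same length as the template and differs from it at one or more positions, which corresponds to the definition of a mutagenized product given above.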

[072] Alternatively, the gene of interest is cloned into a suitable vector and used to transform a microbe. The 
microbe is then grown while exposed to a mutagenizing agent such as nitrosoguanidine or ethyl methanesulfonate 
(Nestmann, Mutat Res 1975 Jun;28(3):323-30), and the vector containing the gene is then isolated from the host.

[073] In one embodiment, the gene identified as upregulated is mutagenized through gene reassembly, saturation
mutagenesis, or other directed evolution techniques. These techniques are known in the art (U.S. Patent 5,605,793,
U.S. Patent 5,830,721, U.S. Patent 6,165,793, U.S. Patent 6,180,406, U.S. Patent 5,939,250, U.S. Patent 6,171,820,
U.S. Patent 6,361,974, U.S. Patent 6,358,709, U.S. Patent 6,352,842, U.S. Patent 6,238,884, U.S. Patent 6,420,175,
U.S. Patent 6,287,861 and related patents; Coco et al., Nat Biotechnol 2001 Apr;19(4):354-9).

[074] It is preferable but not necessary that nucleic acid molecules used in shuffling protocols use the same codon
to encode each individual amino acid. For example, even though 6 different codons encode arginine, only
CGC is used. It is also preferable that the codon used to encode each amino acid is the most preferred codon in the
organism that is transformed with the shuffled sequences. Using only one codon that is the most preferred codon in
the organism is preferred because it gives the nucleic acid fragments higher nucleotide sequence identity and
therefore allows them to anneal better. In addition, every protein encoded by a shuffled sequence is translated at equal
efficiency by the organism. In one embodiment, the organism is C. reinhardtii, at least one nucleic acid molecule
encoding one segment of a protein from SEQ ID NOs: 1-112 is used in a shuffling protocol, and the nucleic acid
molecules that are used in the shuffling protocol use only the most preferred codon from C. reinhardtii as depicted
in figure 10.
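
A minimal sketch of the single-codon convention follows. Only the arginine codon CGC is taken from the text; every other entry in the table is a placeholder chosen for illustration (each is a valid standard-code codon for its amino acid) and should be replaced with the preferred codons of the host organism, for example those depicted in figure 10. The function and table names are hypothetical.

```python
# One codon per amino acid. Arg -> CGC comes from the text; all other entries
# are placeholder assumptions, not a statement of actual host codon preference.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}

def back_translate(protein, table=PREFERRED_CODON):
    """Encode a protein so that each amino acid always uses the same codon."""
    missing = sorted(set(protein) - set(table))
    if missing:
        raise ValueError(f"no codon assigned for: {', '.join(missing)}")
    return "".join(table[aa] for aa in protein)

dna = back_translate("MGGVMEAAR")
```

Because every parent sequence is encoded with the same table, stretches that are identical at the amino acid level are also identical at the nucleotide level, which is the property that favors annealing between fragments.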

[075] In one embodiment, the differentially regulated gene is digested with a nuclease such as DNase I to form
random fragments. These fragments are mixed with similarly digested fragments of at least one other gene that
contains some sequence homology to the differentially regulated gene. Alternatively the fragments are pooled with
synthetic single or double stranded oligonucleotides corresponding to sequences from genes possessing homology
or partial homology to the differentially regulated gene. The mixed fragments are denatured to form single stranded
molecules and the molecules are then allowed to anneal to each other. The fragments are put through an extension
protocol such as primerless PCR in which 3' ends of fragments are extended through the use of a DNA polymerase
enzyme. The resulting mixture contains a library of shuffled sequences that are used to transform cells for
screening or selection procedures.
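
The outcome of fragmentation and reassembly can be approximated with a much-simplified model in which each chimera is built from consecutive stretches copied from randomly chosen parents. The Python sketch below assumes equal-length, pre-aligned parent sequences and does not model annealing chemistry or polymerase extension; the function names, fragment size range, and parent sequences are all assumptions made for illustration.

```python
import random

def shuffle_chimera(parents, min_frag, max_frag, rng):
    """Build one chimeric sequence from aligned, equal-length parent sequences."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this simplified model needs equal-length, aligned parents")
    pieces, pos = [], 0
    while pos < length:
        size = rng.randint(min_frag, max_frag)  # fragment length in nucleotides
        template = rng.choice(parents)          # parent that supplies this stretch
        pieces.append(template[pos:pos + size])
        pos += size
    return "".join(pieces)

def shuffled_library(parents, n, min_frag=15, max_frag=45, seed=0):
    """Return n chimeras; the seed makes the toy library reproducible."""
    rng = random.Random(seed)
    return [shuffle_chimera(parents, min_frag, max_frag, rng) for _ in range(n)]

# Two hypothetical homologous parents of 120 nucleotides each.
parent_a = "ACGT" * 30
parent_b = "TGCA" * 30
library = shuffled_library([parent_a, parent_b], n=20)
```

Each position of every chimera is inherited from one of the parents at the same position, so the library samples combinations of parental segments rather than new point mutations.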

[076] In one embodiment genes that are homologous to genes that are (a) identified as differentially regulated and 
(b) are further identified as upregulated when cells are exposed to conditions more conducive to the generation of 
hydrogen are isolated from evolutionarily similar microbes. For example, the iron hydrogenase gene is upregulated
in C. reinhardtii when the cells are exposed to conditions more conducive to the generation of hydrogen. Other iron
hydrogenase genes are isolated from microbes that are evolutionarily related and/or are known to possess an iron
hydrogenase gene. For sequences of genes homologous to the gene identified as differentially regulated that are
already known, gene fragments corresponding to these genes may be chemically synthesized using known sequence
information; it is not necessary that such genes be actually cloned from their natural source in order to be utilized in
shuffling experiments. Examples of such known iron hydrogenase genes include those listed in the sequence listing.

[077] In one embodiment, nucleic acid fragments encoding protein sequences of at least 5 amino acids are used in
shuffling experiments. Alternatively, the fragments encode at least 6 amino acids, and in some instances at least 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 or more amino acids.

[078] These genes are isolated through procedures known in the art. For instance, the C. reinhardtii iron 
hydrogenase gene is used as a probe to screen cDNA or genomic DNA libraries of other green algae. In particular,
the highly conserved "H-cluster" sequence corresponding to the active site of iron hydrogenases is used as a probe
(Peters, Science (1998) Dec 4;282(5395):1853-8; Nicolet, Structure Fold Des (1999) Jan 15;7(1):13-23).
Alternatively, PCR primers corresponding to sequences from the C. reinhardtii iron hydrogenase gene are used to
amplify iron hydrogenase genes from other microbial genomes. In this method the PCR template is genomic DNA,
a cDNA library, or RNA for use in RT-PCR. The sequences isolated from each microbe are mixed and put through
a shuffling procedure. 

[079] In one embodiment, a plurality of genes is identified from the differential expression profile as upregulated 
when C. reinhardtii cells are exposed to conditions more conducive to the generation of hydrogen. Sequence 
information from these genes is used to generate probes and PCR primers corresponding to the sequences. A
plurality of green algae species, originally isolated from disparate geographic locations, are cultured under
conditions more conducive to the generation of hydrogen. A cDNA library from each green algae species is 
generated and utilized for the isolation of sequences corresponding to each of the sequences identified from C. 
reinhardtii as differentially regulated using the probes corresponding to the upregulated C. reinhardtii sequences. 
The isolated gene sequences are used for shuffling. 

[080] In one embodiment, the plurality of genes is shuffled in reactions containing synthetic chimeric
oligonucleotides. The chimeric oligonucleotides possess on one end sequence corresponding to either the 5' or 3'
end of the coding region of genes included in the shuffling reaction. On the other end these chimeric
oligonucleotides contain heterologous sequence, such as unique sequences not found in the genes that are shuffled
or in the genome of the hydrogen producing microbe. The unique sequences are used to connect different
components of DNA constructs containing mutagenized nucleic acid sequences (Figure 3). Other chimeric
oligonucleotides contain sequences corresponding to (a) a promoter sequence and (b) a unique sequence. The sense
and antisense strands of unique sequences are used to join mutagenized nucleic acid sequences with promoter
sequences and other types of sequence heterologous to the mutagenized nucleic acid sequences. For example, a 
promoter sequence imparts transcriptional activation to a downstream mutagenized nucleic acid sequence when 
placed in a Chlamydomonas cell that is exposed to light (Hahn, Curr Genet (1999) Jan;34(6):459-66; Loppes, Plant 
Mol Biol 2001 Jan;45(2):215-27; Villand, Biochem J 1997 Oct 1;327 (Pt 1):51-7). Other light-inducible promoter
systems may also be used, such as the phytochrome/PIF3 system (Shimizu-Sato, Nat Biotechnol 2002 
Oct;20(10):1041-4). Alternatively or in addition, the promoter sequence imparts transcriptional activation to a 
downstream gene when placed in a Chlamydomonas cell that is exposed to light and heat (Muller, Gene (1992) Feb 
15;111(2):165-73; von Gromoff, Mol Cell Biol (1989) Sep;9(9):3911-8). Alternatively the promoter sequence
imparts transcriptional activation to a downstream gene when an exogenous molecule is added to the culture media 
using receptors not present in the wild-type cell such as receptors for estrogen, ecdysone, or others (Metzger, Nature 
1988 Jul 7;334(6177):31-6; No, Proc Natl Acad Sci U S A 1996 Apr 16;93(8):3346-51). Alternatively the 
promoter sequence imparts transcriptional activation in a constitutive fashion, such as the promoter of the psaD 
gene (Fischer, WO 01/48185). When the shuffled gene fragments are annealed and subjected to primerless PCR, 
the 5' and 3' ends of the shuffled coding regions anneal to chimeric oligonucleotides that in turn anneal to other 
heterologous sequences such as promoters and 3' untranslated regions that enhance expression levels (Lumbreras, 
Plant J (1998) 14(4): 441-447). The 5' end of every coding sequence created through the shuffling procedure is
annealed to a chimeric oligonucleotide corresponding to a unique sequence. The unique sequence in turn anneals to 
a nonshuffled segment of DNA containing a promoter sequence (Figures 3, 4). Unique sequences are thus used to 
attach components of DNA constructs to each other that do not possess sequence homology. In addition, chimeric 
oligonucleotides are included that possess homology to internal parts of the coding region of shuffled genes as well 
as intron sequences to direct the insertion of intron sequences into coding regions to aid in effective expression 
levels (Lumbreras, Plant J (1998) 14(4): 441-447). 

[081] Chimeric oligonucleotides may be used to connect any part of a nucleic acid construct to another in shuffling 
protocols. Intron, transcriptional terminator, splice sequences, centromeres, selectable and screenable markers are 
all introduced into nucleic acid constructs through annealing these elements to chimeric oligonucleotides that 
contain heterologous sequence, followed by primerless PCR protocols.

[082] In one embodiment, libraries of individually shuffled homologous genes with unique sequences at each end 
are mixed with other distinct libraries of individually shuffled homologous genes that also contain unique sequences 
at both 5' and 3' ends. Also mixed with the shuffled libraries of coding sequences are nonshuffled segments 
containing structural and functional DNA elements such as promoters, 3' untranslated regions, and screenable or 
selectable markers. The nonshuffled segments of DNA are also flanked with unique sequences, all of which are 
identical to unique sequences flanking certain shuffled sequences. All of the molecules are denatured, annealed, 
and subjected to a primerless PCR reaction in which "sense" and "antisense" unique sequences anneal to each other 
and prime extension by a polymerase, thus placing each shuffled and nonshuffled sequence into its desired place on 
the resulting DNA construct. The resulting library of DNA constructs contains shuffled genes operatively linked to 
promoter sequences (Figures 3, 4).
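
The role of the unique sequences in fixing the order of parts can be sketched as a simple chaining problem: each segment is labeled with the unique sequence at its 5' end and at its 3' end, and a segment follows another when their tags match. The Python example below is an abstraction under stated assumptions; the segment names, the tag labels U1 and U2, and the use of None for an untagged construct end are hypothetical, and no actual annealing or extension is simulated.

```python
def assemble(segments):
    """Order segments by matching the 3' tag of one to the 5' tag of the next.

    segments: iterable of (name, five_prime_tag, three_prime_tag); a tag of
    None marks an end of the construct that carries no unique sequence.
    """
    by_five_prime = {}
    for name, left, right in segments:
        if left in by_five_prime:
            raise ValueError(f"tag {left!r} is not unique")
        by_five_prime[left] = (name, right)
    if None not in by_five_prime:
        raise ValueError("no segment marks the 5' end of the construct")

    order, tag = [], None
    while True:
        # pop() also guards against circular tag chains.
        name, right = by_five_prime.pop(tag)
        order.append(name)
        if right is None:
            return order
        if right not in by_five_prime:
            raise ValueError(f"nothing anneals to tag {right!r}")
        tag = right

# Parts listed in arbitrary order; the tags alone determine the layout.
parts = [
    ("shuffled_gene", "U1", "U2"),
    ("3'_UTR", "U2", None),
    ("promoter", None, "U1"),
]
construct = assemble(parts)
```

Here `construct` lists the promoter first, then the shuffled coding sequence, then the 3' untranslated region, regardless of the order in which the parts were supplied.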

[083] In one embodiment chimeric oligonucleotides contain sequence corresponding to genes being shuffled and 
heterologous sequence corresponding to introns, splice sequences, centromeres, selectable markers, unique 
sequences or other linker sequences designed to serve as structural parts of the construct. The design of the DNA 
construct using these chimeric oligonucleotides creates a functional DNA construct directly from the shuffling
procedure. Any desired component of a DNA construct is included through the use of chimeric oligonucleotides
that connect heterologous sequences of the construct during the annealing step. For instance, the inclusion of a 
light-inducible promoter allows the shuffled versions of differentially regulated genes to be activated by light rather 
than the conditions more conducive to the generation of hydrogen.

[084] In one embodiment each DNA construct in the library of DNA constructs contains a plurality of shuffled
genes that possess sequence homology to a set of upregulated differentially regulated genes. Each coding region 
has an upstream light-inducible promoter and a downstream untranslated transcriptional terminator sequence. Each 
coding region contains an intron and functional splice sequences. Each construct contains at least one selectable 
marker. Constructs optionally also contain other functional or structural sequences. For example, centromeres or 
other sequences employed for the purpose of allowing the construct to be retained in dividing cells and/or sequences
that aid in integration of the construct into random or specific regions of the host genome are included in the 
construct. In other embodiments the promoter is constitutive or is inducible by a stimulus other than light, such as 
the addition of a small molecule to the culture media. 

[085] In one embodiment, DNA constructs are used to turn off or downregulate the expression of differentially 
regulated genes that are downregulated when cells are exposed to conditions more conducive to the generation of
hydrogen. These constructs work through the use of antisense and/or RNA interference methods. In this 
embodiment, a DNA construct containing at least one antisense sequence operatively linked to a promoter is used to 
transform cells for the purpose of downregulating the expression of a gene or genes that are naturally 
downregulated when cells are exposed to conditions more conducive to the generation of hydrogen. For example, 
in Chlamydomonas, antisense inhibition is utilized to effect a drop in expression of the targeted gene (Schroda,
Plant Cell (1999) Jun;11(6):1165-78). Alternatively, an RNA interference (RNAi) construct is used (Fire, Nature
(1998) Feb 19;391(6669):806-11; Fuhrmann, J Cell Sci (2001) Nov;114(Pt 21):3857-63). In one embodiment,
DNA constructs are synthesized that contain shuffled sequences corresponding to genes upregulated when cells are 
exposed to conditions more conducive to the generation of hydrogen and RNAi sequences corresponding to genes 
downregulated when cells are exposed to conditions conducive to the generation of hydrogen. Both the shuffled
sequences and the RNAi sequences are functionally coupled to promoters that are activated by the same stimuli, 
different stimuli, or are constitutively active. 

[086] In one embodiment genes downregulated when cells are exposed to conditions less conducive to the 
generation of hydrogen are removed from the genome through gene targeting methods that utilize homologous 
recombination (Naver, Plant Cell 2001 Dec;13(12):2731-45).

[087] In one embodiment molecules that interfere with the function of proteins that are encoded by genes 
downregulated when cells are exposed to conditions more conducive to the generation of hydrogen are either placed 
in the culture media or synthesized by proteins encoded by transgenes inserted into cells. 

[088] In one embodiment the DNA constructs containing shuffled upregulated differentially regulated genes 
contain genes encoding screenable or selectable markers at each end of a linear DNA construct. For example, at
one end of the construct is a gene encoding a fluorescent protein optimized for use in Chlamydomonas (Fuhrmann, 
Plant J (1999) Aug;19(3):353-61). At the other end is a gene encoding a selectable marker gene that imparts 
resistance to an antibiotic (Stevens, Mol Gen Genet (1996) Apr 24;251(1):23-30). Between the fluorescent protein
and the antibiotic resistance gene are shuffled versions of genes upregulated when cells are exposed to conditions 
more conducive to the generation of hydrogen or are involved in the hydrogen production pathway, such as
ferredoxin, catalase, isoamylase, malate dehydrogenase, 14-3-3 protein, enolase, aldolase, ribosomal protein S8,
ribosomal protein L17, ribosomal protein S18, ribosomal protein L37, ribosomal protein L12, ribosomal protein
S15, iron-hydrogenase, and components of the photosystem I, photosystem II and cytochrome b6-f complexes.
Components of the photosystem I and II complexes are disclosed, for example, in Elrad, Curr Genet. 2003 Dec 2.
Hydrogen can be produced in C. reinhardtii, for example, by pathways that operate in light and dark. Mutagenized
genes from either pathway can be assayed using the methods disclosed herein. Cells are transformed with the 
library of constructs and are cultured in media containing the antibiotic. Cells that survive under these culture 
conditions are run through a fluorescence activated cell sorter that plates each cell expressing the green fluorescent 
protein onto a grid pattern on solid media or into multiwell plates containing liquid growth media containing the 
antibiotic. Colonies are screened or selected for the ability to generate an increased amount of hydrogen. Cells that 
retain both markers have also retained all the sequence in the DNA construct between the two markers. Large 
numbers of genes may be placed between the two markers. Preferably only cells that retain both markers are put 
through screening or selection procedures. 

[089] In one embodiment the mutagenized nucleic acid sequence encodes an iron hydrogenase protein and the cell 
is a green algae species such as C. reinhardtii. Further, the mutagenized nucleic acid sequence is generated by 
mutagenizing a C. reinhardtii iron hydrogenase gene at at least one amino acid position. The mutagenized nucleic 
acid sequence is used in a construct to transform the cell. Preferably, the iron hydrogenase protein retains the 
capacity to functionally interact with a ferredoxin or other electron donor in the cell. "Functionally interact" means 
that a ferredoxin or other electron donor transfers electrons to the hydrogenase protein. Preferably the sequence 
change(s) caused by the mutagenesis of the C. reinhardtii iron hydrogenase gene does not disrupt the functional 
interaction between the protein encoded by the mutagenized C. reinhardtii iron hydrogenase gene and ferredoxin or 
another electron donor. Preferably the mutagenesis creates an oxygen tolerance phenotype without disrupting the 
functional interaction with a ferredoxin. More preferably, the mutagenesis creates an oxygen tolerance phenotype 
while enhancing the functional interaction with a ferredoxin. An example of an enhanced functional interaction 
with ferredoxin is a functional interaction that allows more electrons to be shuttled from the endogenous ferredoxin 
to the mutagenized iron hydrogenase per unit time than with the non-mutagenized C. reinhardtii iron
hydrogenase. An enhanced functional interaction can also be screened or selected for by mutagenizing the 
ferredoxin, as described in Example 2. 

Providing mutagenized nucleic acid sequences corresponding to genes known to be involved in a hydrogen
production pathway 

[090] Wild type iron hydrogenase genes are preferred mutagenesis targets with which to generate mutagenized 
nucleic acid sequences. Mutagenesis preferably alters characteristics such as oxygen tolerance while not altering 
characteristics such as the ability to functionally interact with ferredoxin. 

[091] In one embodiment, the C. reinhardtii iron hydrogenase gene is mutated to alter amino acid residues in and 
near the gas channel. The gas channel is a section of iron hydrogenases, depicted in figure 9, that allows newly 
formed hydrogen molecules to leave the protein. Oxygen irreversibly inactivates the active site of iron 
hydrogenases by entering the active site through the gas channel (for background see Ghirardi, Appl Biochem 
Biotechnol (1997) 63-65:141-151). Because hydrogen molecules are smaller than oxygen molecules, narrowing the
gas channel using methods disclosed herein provides iron hydrogenases that are not inactivated by oxygen.
Preferably, substitutions of residues that are in and near the gas channel generate side chains that are of higher 
molecular weight or are longer than the side chain at that position in the wild type protein. Such substitutions are 
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preferable because they narrow the gas channel and block the entry of oxygen into the active site. As one 
nonlimiting example, residues in the highly conserved X I X 2 X 3 FX 4 X 5 X 6 GGVMEAAX 7 R segment can be mutated. 
This segment forms a turn followed by an alpha helix. The F corresponds to Phe234 in the wild type C. reinhardtii 
iron hydrogenase. The X residues are highly variable between iron hydrogenase from different species. For 
example, the X4X5X6 residues are GVT, GAT, GVS, GNS, CAS, and numerous other sequences in different iron 
hydrogenases. Nonetheless, members of the iron hydrogenase family usually have a G as the first residue of this 
triplet. Although the GGVMEAA amino acid motif is highly conserved among members of the iron hydrogenase 
family, there are some iron hydrogenases that have variant sequences corresponding to this motif. For example, the 
D. fructosovorans iron hydrogenase (GenBank Accession number D57150) has the sequence GGVIEAA. Thus, 
even highly conserved motifs that surround the gas channel are tolerant of change. 
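
As a minimal sketch of how the conserved segment described above could be located in a candidate protein sequence, the X1X2X3FX4X5X6GGVMEAAX7R pattern can be written as a regular expression. The function name, the choice to accept I in place of M (so that the GGVIEAA variant also matches), and the example fragment are illustrative assumptions and are not taken from the sequence listing.

```python
import re

# Each X is any residue; [MI] admits GGVMEAA and the GGVIEAA variant.
# The named groups are hypothetical labels for the variable positions.
GAS_CHANNEL_MOTIF = re.compile(
    r"(?P<x123>.{3})F(?P<x456>.{3})GGV[MI]EAA(?P<x7>.)R"
)

def find_gas_channel_segment(protein):
    """Return (start, X1X2X3, X4X5X6, X7) for the first match, or None."""
    match = GAS_CHANNEL_MOTIF.search(protein)
    if match is None:
        return None
    return (match.start(), match.group("x123"),
            match.group("x456"), match.group("x7"))

# Made-up fragment for illustration only, not a real hydrogenase sequence.
example_fragment = "MKKAAPLFGVTGGVMEAALRQQ"
print(find_gas_channel_segment(example_fragment))
```

The reported X positions are the candidates for saturation mutagenesis or cross-species substitution; the conserved F and GGVMEAA positions are left fixed by the pattern.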

[092] Other amino acid motifs also form secondary structures near the gas channel. For example, the 
ADX8TIX9EE motif is in close contact with the channel. In particular, the T, I and X9 residues are near the channel. 

[093] In one embodiment, highly variable amino acids are subjected to saturation mutagenesis. In another 
embodiment, highly variable amino acids are substituted with any amino acid that is of a higher molecular weight 
than the wild type amino acid at that position in either of the C. reinhardtii iron hydrogenases. In another 
embodiment, variable amino acids in either of the C. reinhardtii iron hydrogenases are substituted with amino acids 
that are found in the corresponding position in iron hydrogenases from different species. In yet another 
embodiment, the X1X2X3FX4X5X6GGVMEAAX7R motif is mutated in either of the C. reinhardtii iron 
hydrogenases referred to as hydA and hydB (Forestier, Eur J Biochem. 2003 Jul;270(13):2750-8), wherein some of 
the X residues are substituted with amino acids that are found in the corresponding position in iron hydrogenases 
from different species while other X residues are substituted with residues that are not found in any known species. 
In one embodiment residues X1X2X3 are from species 1, residues X4X5X6 are from species 2, and residue X7 is from 
species 3, where these X residues are placed in the context of a C. reinhardtii iron hydrogenase protein, and where 
none of species 1, 2, or 3 is C. reinhardtii. The methods provided herein include mutagenizing genes by substituting 
any segment of a protein sequence into another protein sequence, including genes encoding iron and nickel-iron 
hydrogenase proteins. Preferable lengths for segments include 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more 
amino acids. Of course, the methods provided also include substituting single amino acids from one species into 
the proteins of another species at a particular position as well as substituting amino acids that do not correspond to 
amino acids of another species at a particular position. 
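
The embodiment in which a variable residue is replaced by any amino acid of higher molecular weight can be sketched as a simple lookup. The table below uses standard molecular weights of the free amino acids (g/mol); the function name is hypothetical, and molecular weight is used here only as the rough proxy for side chain bulk that the text describes.

```python
# Standard molecular weights of the 20 free amino acids, in g/mol.
AMINO_ACID_MW = {
    "G": 75.07, "A": 89.09, "S": 105.09, "P": 115.13, "V": 117.15,
    "T": 119.12, "C": 121.16, "L": 131.17, "I": 131.17, "N": 132.12,
    "D": 133.10, "Q": 146.15, "K": 146.19, "E": 147.13, "M": 149.21,
    "H": 155.16, "F": 165.19, "R": 174.20, "Y": 181.19, "W": 204.23,
}

def heavier_substitutions(wild_type):
    """List residues strictly heavier than the wild type residue, lightest first."""
    threshold = AMINO_ACID_MW[wild_type]
    heavier = [aa for aa, mw in AMINO_ACID_MW.items() if mw > threshold]
    return sorted(heavier, key=AMINO_ACID_MW.get)

# Candidate replacements for a phenylalanine at a gas channel position.
print(heavier_substitutions("F"))
```

A position that is already tryptophan returns an empty list, which signals that narrowing at that position would have to come from a neighboring residue instead.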

[094] In another embodiment, gene reassembly of the iron hydrogenase is performed. Sections of the C. reinhardtii 
iron hydrogenase active site region that are both highly conserved and correspond to the gas channel are used to 
construct a library of iron hydrogenase genes, depicted schematically in figure 13. In step 1, the library of iron 
hydrogenase amino acid sequences from SEQ ID NOs: 1-112 was aligned using sequence manipulation software 
(DS Gene, Accelrys Inc., San Diego, CA). The key in figure 15 shows the identity of amino acids from step 1 and 
codons from steps 2-9. All bars in steps 2-9 correspond to codons that encode the amino acids from the bars of step 
1. Each bar in steps 2-9 therefore depicts a codon triplet of oligonucleotide sequence. In step 2, conserved amino 
acid segments were identified in the alignment and reverse-translated into single stranded oligonucleotide sequences 
utilizing C. reinhardtii most preferred codons. In step 3, 3 codons encoding amino acids flanking these highly 
conserved gas channel sequences were re-written as the C. reinhardtii flanking sequence of the oligonucleotides. 

Even though these oligonucleotides encode gas channel segments that differ from those of the C. reinhardtii iron 
hydrogenase, the combination of the recoding process and the substitution of 3 flanking C. reinhardtii codons 
generates enough nucleotide similarity that these oligonucleotides anneal to a complementary strand encoding the 
recoded, wild-type C. reinhardtii iron hydrogenase. In step 4, the set of recoded oligonucleotides corresponding to 
diverse gas channel segments is annealed to a single stranded DNA molecule that encodes the C. reinhardtii iron 
hydrogenase protein using the same C. reinhardtii most preferred codons. In addition, oligonucleotides 
corresponding to wild type C. reinhardtii amino acid sequences with single residue substitutions designed to narrow 
the gas channel can also be included in the annealing reaction. A C. reinhardtii C-terminal primer is also added to 
the annealing reaction. The single stranded molecule is generated by isolating the gene from a plasmid grown in a 
methylating host cell, followed by denaturation and separation of the strands by HPLC or other standard 
procedures, as described for example in U.S. patent 6,361,974. As shown in step 5 of figure 14, different 
combinations of segments anneal to each full length complementary strand. Addition of DNA Polymerase in step 6 
extends the annealed oligonucleotides, creating a library of double stranded hybrid molecules with mismatches at 
"context" residue positions. Preferably the DNA Polymerase is exonuclease-deficient to prevent it from degrading 
parts of annealed primers in its path as it extends between annealed primers. In step 7, the methylated strands are 
digested using a methylation-sensitive endonuclease, as described for example in U.S. patent 6,361,974. In steps 8- 
9, an N-terminal C. reinhardtii primer and DNA Polymerase are added to the library of novel iron hydrogenase 
molecules. As an alternative to methylation, the C-terminal primer shown first in step 4 can be biotinylated, and the 
mismatched wild type and library strands can be separated in step 7 by denaturation and separation using 
immobilized streptavidin. 
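
Steps 2 and 3 above (reverse translation into a single most preferred codon per amino acid, with host flanking codons on the oligonucleotide) can be sketched as follows. The codon table is an illustrative assumption of GC-rich preferred codons and should be replaced with an actual C. reinhardtii codon usage table; the function names, and the reading of "3 codons" as three flanking residues on each side, are likewise assumptions.

```python
# Assumed one-codon-per-residue table; replace with measured codon usage.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}

def reverse_translate(peptide):
    """Recode a peptide using exactly one codon for each amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in peptide)

def recoded_oligo(segment, left_flank, right_flank):
    """Recode a foreign segment placed between three-residue host flanks."""
    if len(left_flank) != 3 or len(right_flank) != 3:
        raise ValueError("each flank must be exactly three residues")
    return reverse_translate(left_flank + segment + right_flank)

print(reverse_translate("GGVMEAA"))
```

Because every residue maps to one fixed codon, any stretch that is identical at the amino acid level between the foreign segment and the host protein is also identical at the nucleotide level, which is what lets the recoded oligonucleotides anneal to the recoded host strand.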

[095] The result of the above process is a library of double stranded iron hydrogenase sequences that have random 
combinations of functional gas channel segments and C. reinhardtii framework/hinge regions. The population is 
cloned into C. reinhardtii cells and assayed as described in previous sections. This method does not use an 
exonuclease such as mung bean nuclease. No single stranded fragments that anneal to the methylated strand have 
partially overlapping binding sites. The advantage of this method of creating mutagenized nucleic acid sequences is 
that the library can be tested for oxygen tolerance while better preserving the C. reinhardtii framework/hinge domains 
that functionally interact with ferredoxin than a library made using other gene reassembly procedures, such as the 
procedure shown in figures 2-3 that involves reassembly of the entire gene sequence. In a preferred embodiment, 
single stranded nucleotide molecules, using C. reinhardtii most preferred codons, encoding segments or fragments 
of segments depicted in figure 16 are used in the procedure. Although figure 17 depicts one possible arrangement 
of three diverse oligonucleotides that can be annealed to a single stranded wild type sequence, mixing 
oligonucleotides corresponding to each of the identified gas channel segments from SEQ ID NOs: 124-147 that have 
C. reinhardtii flanking codons produces a large number of possible combinations of library sequences. Each 
possible combination corresponds to a different gas channel architecture that can be tested for the ability to allow 
flow of hydrogen but not oxygen. 
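
The number of possible library members grows as the product of the number of alternative segments available at each annealing site. The following sketch counts and enumerates such combinations; the site and segment labels are placeholders, since the distribution of the listed segments across sites is not assumed here.

```python
from itertools import product
from math import prod

def library_size(variants_per_site):
    """Number of distinct combinations, one segment chosen per site."""
    return prod(len(variants) for variants in variants_per_site)

def enumerate_library(variants_per_site):
    """Every combination of one segment per site, as tuples."""
    return list(product(*variants_per_site))

# Hypothetical example: three annealing sites with 2, 3 and 1 alternatives.
sites = [["s1a", "s1b"], ["s2a", "s2b", "s2c"], ["s3a"]]
print(library_size(sites), enumerate_library(sites)[0])
```

Including the wild type segment as one of the alternatives at each site keeps the parental architecture in the library as an internal control.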

[096] Alternatively, other genes involved in a hydrogen production pathway are mutagenized. Examples of these 
genes are recited elsewhere in this application. As one example, genes encoding light antenna complexes are 
mutagenized and inserted into cells. For example, one or more genes from a light harvesting complex of C. 
reinhardtii, such as those disclosed in Teramoto, Plant Cell Physiol. 2001 Aug;42(8):849-56. (corresponding to 
GenBank accession numbers M24072, AF104630, AF104631, AB050007, X65119), and Elrad, Curr Genet. 2003 
Dec 2 (lhcbm1, lhcbm2, lhcbm3, lhcbm4, lhcbm5, lhcbm6, lhcbm8, lhcbm9, lhcbm11, lhca1, lhca2, lhca3, lhca4, 
lhca5, lhca6, lhca7, lhca8, lhca9, lhcb4, lhcb5, lhcq, li818-1, li818-2, elip1, elip2, elip3, elip4, and elip5) are 
mutagenized and used to transform C. reinhardtii. Transformants are screened or selected for the ability to produce 
an increased amount of hydrogen under conditions such as high light, low light, sunlight, or light of a certain 
wavelength range. For example, segments of amino acids from antenna proteins of one species are inserted into 
antenna proteins from C. reinhardtii. The mutagenized nucleic acid sequence is then inserted into C. reinhardtii cells 
and the transformed cells are screened or selected for the ability to live and/or produce hydrogen in the presence of 
photoautotrophic media and light. In one embodiment the light is of a wavelength that wild type C. reinhardtii 
antenna proteins are not capable of harvesting. 

[097] In another embodiment, an siRNA construct is used to transform a cell, where the siRNA construct is 
designed to reduce or eliminate the expression of a gene that reduces the photosynthetic efficiency or rate. For 
example, expression of the C. reinhardtii lhcbm1 gene is reduced or eliminated using siRNA (sequence of lhcbm1 in 
Elrad, Plant Cell. 2002 Aug;14(8):1801-16). 

[098] In one embodiment, cells transformed with mutagenized antenna genes are cultured in the presence of light 
outside the normal wavelength range of the starting strain. For example, genes encoding purple bacteria antenna 
complexes are transformed into green algae such as C. reinhardtii. The genes include preferably only the most 
preferred codon of C. reinhardtii for each amino acid. Preferably, bacteriochlorophyll molecules are present in the 
cells, either synthesized by enzymes also present in the C. reinhardtii cell or added exogenously to the culture 
media. The cells are cultured in photoautotrophic media under light of wavelengths that wild type green algae are 
not capable of capturing, such as 770-920nm. Narrow ranges can be used as well, such as 800-900nm. In one 
embodiment, the α peptides of Rs. rubrum, Rb. sphaeroides, and Rb. capsulatus are reverse translated into C. 
reinhardtii most preferred codons (see sequences from Davis, Biochemistry. 1997 Mar 25;36(12):3671-9). These α 
peptide genes, encoding amino acids only in C. reinhardtii most preferred codons, are shuffled. The β peptides 
from the above three organisms, also as shown in Davis, are also reverse translated into C. reinhardtii most 
preferred codons and shuffled. The shuffled α and β peptides are cloned into expression vectors and used to 
transform C. reinhardtii. Preferably the α and β peptide sequences also include targeting domains that cause the 
expressed proteins to be embedded in light harvesting complexes of the C. reinhardtii thylakoid membrane. The 
transformed population is cultured under light of a wavelength above 700nm, preferably above 750 nm, more 
preferably above 800nm. Surviving strains are then assayed for hydrogen production in light of a wavelength above 
700nm, preferably above 750 nm, more preferably above 800nm. 

[099] In another embodiment, shuffling is performed using nucleic acid molecules encoding nickel-iron 
hydrogenase proteins, such as those in SEQ ID NOs: 113-122. Because these Ni-Fe hydrogenases are made of 
alpha and beta subunits, preferably the nucleic acid molecules encoding segments of each protein are shuffled in 
separate reactions. The shuffled libraries are expressed in cells that possess Ni-Fe hydrogenase 
maturation enzymes, such as E. coli. 

Transforming cells with mutagenized nucleic acid sequences 

[100] Cell transformation methods and selectable markers for photosynthetic bacteria and cyanobacteria are well 
known in the art (Wirth, Mol Gen Genet 1989 Mar;216(1):175-7; Koksharova, Appl Microbiol Biotechnol 2002 
Feb;58(2):123-37; Thelwell). Transformation methods and selectable markers for use in bacteria are well known 
(Maniatis et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory). 

[101] In green algae, the nuclear, mitochondrial, and chloroplast genomes are transformed through a variety of 
known methods. (Kindle, J Cell Biol (1989) Dec;109(6 Pt 1):2589-601; Kindle, Proc Natl Acad Sci U S A (1990) 
Feb;87(3):1228-32; Kindle, Proc Natl Acad Sci U S A (1991) Mar 1;88(5):1721-5; Shimogawara, Genetics (1998) 
Apr;148(4):1821-8; Boynton, Science (1988) Jun 10;240(4858):1534-8; Boynton, Methods Enzymol (1996) 
264:279-96; Randolph-Anderson, Mol Gen Genet (1993) Jan;236(2-3):235-44). 

[102] Selectable markers for use in Chlamydomonas are known, including but not limited to markers imparting 
spectinomycin resistance (Fargo, Mol Cell Biol (1999) Oct;19(10):6980-90), kanamycin and amikacin resistance 
(Bateman, Mol Gen Genet (2000) Apr;263(3):404-10), zeomycin and phleomycin resistance (Stevens, Mol Gen 
Genet (1996) Apr 24;251(1):23-30), and paromomycin and neomycin resistance (Sizova, Gene (2001) Oct 
17;277(1-2):221-9). 

[103] Screenable markers are available in Chlamydomonas, such as the green fluorescent protein (Fuhrmann, Plant 
J (1999) Aug;19(3):353-61) and the Renilla luciferase gene (Minko, Mol Gen Genet (1999) Oct;262(3):421-5). 
Fluorescent proteins are also available for prokaryotic organisms. 

[104] In one embodiment, libraries of gene sequences that encode proteins that physically interact are shuffled. 
Nucleic acid constructs are used for transformation procedures that contain a shuffled version of each gene. 
Sequences that encode proteins that interact in ways more conducive to the generation of hydrogen are screened or 
selected for. By mutagenizing sequences encoding proteins that physically interact, more favorable interactions are 
generated that lead to the production of increased levels of hydrogen. Examples of such proteins in the hydrogen 
production pathway that physically interact are iron-hydrogenase/ferredoxin and proteins in the photosystem I, 
photosystem II, and cytochrome b6-f complexes. It is advantageous but not necessary to use pairs or sets of genes 
that encode proteins that physically interact from the same organisms. Providing interacting pairs or sets in the 
shuffling procedure increases the odds of obtaining favorable functional interactions due to the possibility of 
obtaining shuffled sequences on the same test construct that contain complementary interaction domains from the 
same organism, regardless of the sequence flanking either side of the interaction domain in any of the sequences. 

[105] In one embodiment, a library of sequences corresponding to at least one mutagenized nucleic acid sequence 
derived from a differentially regulated gene is inserted into cells through a transformation procedure. Cells that 
have been transformed with the library are then put through a screening or selection process in which the cells are 
assayed for the ability to generate an increased amount of hydrogen when compared to the non-transformed strain 
or the strain transformed with only vector and/or screenable/selectable marker sequences. 

Screening or Selecting for a Cell that Generates an Increased Amount of Hydrogen 

[106] Cells are screened for the ability to produce hydrogen by a variety of methods. One method involves the use 
of gas chromatography, which is a well known method of detecting gases such as hydrogen. An intake device 
attached to the gas chromatography machine is placed in close enough proximity to the cell culture container or 
plate that it can detect, and preferably quantify, the hydrogen produced by the cells (U.S. Patent 5,100,781). 

[107] Oxygen content may be manipulated in the culture container. The amount of oxygen in the culture container 
may be directly adjusted through gas exchange or indirectly by allowing or inducing the water-splitting mechanism 
of photosynthesis. The oxygen content, like all other culture parameters, may be manipulated throughout the 
culture period or held constant. The presence of some amount of oxygen is preferred if MNZ is added to the culture 
media. Preferred hydrogenase genes are capable of catalyzing the production of hydrogen in the presence of 
oxygen. A preferable amount of oxygen in a culture of commercially deployed cells for hydrogen production is an 
atmospheric level such as approximately 21%. Several rounds of screening or selection may be performed in which 
the oxygen content of the culture container may be increased between each successive round while hydrogen 
production is assayed. For example, a culture is exposed to 5% oxygen in the first screening or selection round, 
10% oxygen in the second screening or selection round, 15% oxygen in the third screening or selection round, and 
20% oxygen in the fourth screening or selection round. Other levels of oxygen that can be tested include more than 
0.5%, more than 5.0%, more than 10%, more than 15%, approximately 21%, more than 21%, more than 25%, more 
than 30% or more than 35%. 
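
The stepwise increase in oxygen between rounds can be sketched as a loop over an oxygen schedule in which only strains that outperform a baseline at each level are carried forward. The function name and the two callables (one returning measured hydrogen output for a strain at a given oxygen percentage, one returning the baseline output at that percentage) are hypothetical placeholders for whatever assay is actually used; any mutagenesis between rounds is omitted from this sketch.

```python
def stepwise_oxygen_screen(strains, assay, baseline, schedule=(5, 10, 15, 20)):
    """Carry forward strains whose output exceeds the baseline at each level.

    assay(strain, o2_percent) and baseline(o2_percent) are supplied by the
    caller; both are assumptions standing in for a real measurement.
    """
    survivors = list(strains)
    for o2_percent in schedule:
        survivors = [s for s in survivors
                     if assay(s, o2_percent) > baseline(o2_percent)]
    return survivors

# Toy data: each strain tolerates oxygen up to the given percentage.
tolerance = {"strain_a": 5, "strain_b": 20, "strain_c": 15}
toy_assay = lambda strain, o2: 1.0 if tolerance[strain] >= o2 else 0.0
toy_baseline = lambda o2: 0.5
print(stepwise_oxygen_screen(tolerance, toy_assay, toy_baseline))
```

The schedule argument can be changed to any of the other oxygen levels listed above without altering the loop.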

[108] In one embodiment, the screening assay uses a chemochromic film that turns from transparent to opaque in the 
presence of hydrogen. The assay is performed by placing films over arrays of multiwell plates containing libraries 
of C. reinhardtii transformants. As shown in figure 8, independent transformants are cultured in multiwell plates. 
The film seals each well. Hydrogen produced by cells is reversibly coordinated to the transition metal in the film, 
causing the film to go from transparent to opaque in a quantitative fashion. The film is photographed with digital 
imaging equipment and cells from wells corresponding to spots darker than the starting strain are selected for 
further rounds of mutagenesis. 
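
Selecting wells from the photographed film amounts to comparing a per-well darkness value against the well holding the starting strain. The sketch below assumes that image analysis has already reduced each well to a single darkness number (higher meaning more opaque); the function name, the well labels, and the optional margin parameter are illustrative assumptions.

```python
def select_darker_wells(darkness, reference_well, margin=0.0):
    """Return wells whose film spot is darker than the reference well's spot.

    darkness maps a well label to a darkness value; margin is an optional
    extra amount a well must exceed the reference by to be selected.
    """
    reference = darkness[reference_well]
    return sorted(
        well for well, value in darkness.items()
        if well != reference_well and value > reference + margin
    )

# Hypothetical readings; well A1 holds the starting strain.
readings = {"A1": 0.20, "A2": 0.35, "A3": 0.18, "A4": 0.50}
print(select_darker_wells(readings, "A1"))
```

A nonzero margin is one way to avoid picking wells that differ from the starting strain only by imaging noise.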

[109] The assay is performed using a platform in which a variety of parameters are manipulated. The platform 
contains an enclosed chamber in which multiwell plates are exposed to a controlled gas environment. Lights are 
positioned over the chamber such that daylight/nighttime conditions may be mimicked. The temperature of the 
chamber may be manipulated corresponding to colder nighttime temperatures followed by warmer daytime 
temperatures. The platform allows the directed evolution procedure to create novel microbe strains that are best 
suited for commercial deployment. For example, in one embodiment strains that can produce hydrogen for 
hundreds of hours using constant light at a constant temperature are assayed for; in a second embodiment strains 
capable of producing large amounts of hydrogen during a warmer 12 hour light period after being exposed to a 
colder 12 hour dark period are assayed for. Strains produced by the second embodiment are best suited for 
commercial deployment because they are best able to conserve energy at night when the photosynthetic electron 
transport chain is not functional. 

[110] In one embodiment, the hydrogen production assay mimics commercial deployment conditions through the 
use of deep-well plates made from non-transparent plastic material. When mutants are assayed for hydrogen 
production, the light available to the cells comes only from directly above the plates, mimicking conditions under 
which cells in a large bioreactor are exposed to light. Mutations that attenuate phototaxis (swimming towards light) 
under bright light conditions (but not dim conditions) prevent cells from accumulating at the surface of the media 
and blocking photons from penetrating deeper into the media. Mutations in the antenna complexes also enhance 
photon utilization efficiency. 

[111] In one embodiment, cells transformed with mutagenized nucleic acid sequences are cultured under conditions 
in which gas in the culture container comprises 5% oxygen. Cells that generate an increased amount of hydrogen 
are recovered and mutagenized nucleic acid sequences are recovered from the cells. The mutagenized nucleic acid 
sequences are put through a further mutagenesis round and are used to transform cells. The transformed cells are 
cultured under 21% oxygen. Mutagenized nucleic acid sequences corresponding to differentially regulated genes 
whose wild type sequence encodes proteins that do not function or minimally function in atmospheric oxygen 
levels, such as the C. reinhardtii iron hydrogenase, provide oxygen tolerant variants to the transformed cells. 
Shuffling protocols that include versions of genes that possess desirable characteristics, such as the iron 
hydrogenase gene from Desulfovibrio vulgaris, which is reversibly inactivated by oxygen, are likely to generate 
shuffled genes with multiple desirable characteristics from different parent genes. 

[112] In one embodiment cells transformed with mutagenized nucleic acid sequences are cultured in the presence 
of metronidazole and are selected for the ability to produce increased amounts of hydrogen according to known 
methods (U.S. Patent 5,871,952). 

[113] Alternatively other sensing methods are utilized. Compounds that reversibly react with hydrogen are used to 
synthesize films that are placed either directly on or in proximity to distinct colonies on culture plates or culture 
containers. The film changes a detectable characteristic in the presence of hydrogen, such as a change of color or a 
change from clear to opaque. In one embodiment, a substrate containing a hydrogen-dissociative catalyst metal 
such as tungsten trioxide is placed on or near colonies of cells and turns from transparent to blue/opaque in the 
presence of hydrogen (U.S. Patent 6,277,589). 

[114] There are other methods, both direct and indirect, that are used to detect hydrogen, such as spectroscopic 
methods (U.S. Patent 6,309,604). Other types of gas sensors suitable for detection of hydrogen are well known in 
the art. 

[115] Colonies of cells transformed with mutagenized sequences corresponding to differentially regulated genes 
that produce an increased amount of hydrogen under a given set of conditions compared to the starting strain or cells 
transformed with only vector and/or marker sequences are identified in this screening step. These novel strains are 
then utilized for the production of hydrogen. 

[116] In one embodiment, the DNA construct, or substantial parts of the DNA construct, containing the 
mutagenized sequences is cloned, amplified, or otherwise recovered from a first strain that generates an increased 
amount of hydrogen. The DNA construct is put through further mutagenesis protocols to generate a new library of 
DNA constructs used for further screening or selection of new strains that generate increased amounts of hydrogen 
compared to the originally identified first strain. 

[117] Nucleic acid constructs used for transforming cells may be in circular form or linear form. In addition, such 
constructs may be comprised of DNA or RNA. For instance, bacterial artificial chromosomes may be utilized and are 
comprised of DNA. Alternatively, RNA vectors, such as viruses, may also be used. Viral transformation protocols 
for microbes are well known in the art. 

[118] In one embodiment, cells are screened for increased production of hydrogen in a high-throughput fashion 
after being grown on solid culture media. Colonies are identified as novel strains that produce increased amounts of 
hydrogen. The mutagenized sequences that impart the phenotype of the ability to produce increased amounts of 
hydrogen are isolated from each strain of the plurality of colonies. The isolated sequences are then put through 
another round of shuffling, in which the sequences are randomly cleaved, denatured, reannealed, and extended 
using a polymerase to generate a new library of mutagenized sequences. The sequences are then used to transform 
strains of the host microbe in a new round of screening or selection to generate further novel strains that produce 
increased amounts of hydrogen compared to the previous plurality of colonies. This process is repeated as many 
times as desired. High throughput methods of manipulating cells are well known in the art, and cells can be plated 
on solid media in densities of 9 colonies or more per square inch (Hicks, Plant Physiol 2001 Dec;127(4):1334-8). 

Mating of Strains 

[119] In one embodiment, different differentially regulated genes are mutagenized and used to transform cells for 
screening or selection for transformants that generate an increased amount of hydrogen. Transformants that have 
been transformed with mutagenized nucleic acid sequences corresponding to different differentially regulated genes 
are then mated to each other to provide progeny containing different combinations of mutagenized nucleic acid 
sequences. The progeny are then screened or selected for the ability to generate an increased amount of hydrogen. 
Screenable or selectable markers may be excised through such techniques as the Cre-lox system or FLP 
recombinase. Mating protocols, such as protoplast fusion, are known in the art. In addition, mating protocols for 
organisms such as green algae are also known (Harris, (1989) The Chlamydomonas Sourcebook. Academic Press, 
New York). 

[120] In another embodiment, cells that produce an increased amount of hydrogen due to random mutagenesis, 
such as chemical or insertion mutagenesis, are mated to cells that produce an increased amount of hydrogen due to 
mutagenized nucleic acid sequences corresponding to genes that are involved in a hydrogen production pathway. 
The progeny from the mating are screened or selected for the ability to generate an increased amount of hydrogen 
compared to any parental strain. Any strain that differs in genome sequence from a wild-type strain that produces 
an increased amount of hydrogen compared to the strain from which it is derived can be mated to a second strain 
distinct in genome sequence from the first strain that also produces an increased amount of hydrogen compared to 
the strain from which it is derived. Progeny from the mating are screened or selected for the ability to produce an 
increased amount of hydrogen compared to either parent. This type of mating, referred to as pairwise mating, is 
depicted in figure 11. 

[121] In another embodiment, three or more strains that have distinct genome sequences and produce an increased 
amount of hydrogen are mated to each other in a multiparental mating reaction, and the progeny are screened or 
selected for the ability to produce an increased amount of hydrogen compared to parental strains. In green algae 
multiparental mating, cells are induced to undergo gametogenesis by removing nitrogen from the media. Cells mate 
to form zygospores. The cells are induced to germinate by adding nitrogen back to the media. The population is 
then induced to mate again by removing nitrogen to induce gametogenesis again, followed by adding nitrogen back 
to the media. The process can be repeated as many times as desired, allowing for shuffling of genomes. Because 
green algae are of mating type + or - and because cells only mate with cells of the opposite mating type, at least one 
strain in the multiparental mating reaction must be of opposite mating type from at least one other strain in the 
reaction. Multiparental mating is described further in Example 3 and is depicted in figure 12. Multiparental mating 
in green algae such as Chlamydomonas can be achieved through cycling the level of nitrogen in the media and 
allowing the different strains to mate and produce progeny. Preferably more than one nitrogen deprivation mating 
cycle is performed before the cells are screened or selected for a desired phenotype. Multiparental mating allows 
multiple advantageous genetic alterations in the genome sequence of distinct strains to be concentrated into a single 
genome, allowing the individual phenotypic effect of each genetic alteration to be exerted in the presence of the 
other phenotypic effects of the other genetic alterations. Concentrating multiple advantageous genetic alterations 
therefore allows for additive or synergistic effects of multiple genetic alterations to be achieved. In one embodiment, 
the progeny of the mating are screened for the ability to generate an increased amount of hydrogen compared to all 
parental strains using multiwell plates containing photoautotrophic culture media, where chemochromic films are 
placed over the multiwell plates. A major advantage of multiparental mating is that genetic alterations that 
originate in cells of the same mating type can be put into the same strain through repeated nitrogen cycling in a 
mating reaction. Progeny from multiparental mating reactions can be screened or selected for any desired 
phenotype, including hydrogen production, dissolved solid transport in or out of cells, ability to survive in certain 
environments such as high sunlight, low sunlight, or light of a certain wavelength, or ability to survive in 
environments such as high salt, low salt or brackish water, the ability to bind or decompose an environmental 
pollutant such as PCBs, heavy metals, dioxins, and other molecules, the ability to live on a certain food source, the 
ability to synthesize a desired molecule, a large number of chloroplasts per cell, and any other desired phenotype. 
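
The point that alterations originating in strains of the same mating type can be brought together only through repeated cycles can be illustrated with a small reachability model. This is a deliberately simplified sketch: each cell is a pair of mating type and set of alterations, every + cell may mate with every - cell in a cycle, progeny are assumed able to inherit the union of both parents' alterations (the best case for unlinked loci), progeny of either mating type arise, and unmated cells persist. The function names are hypothetical.

```python
from itertools import product

def mating_cycle(population):
    """One nitrogen-deprivation cycle over a set of (mating_type, alterations)."""
    plus = [alts for mt, alts in population if mt == "+"]
    minus = [alts for mt, alts in population if mt == "-"]
    next_population = set(population)  # assumption: unmated cells persist
    for a, b in product(plus, minus):
        combined = a | b               # best case: progeny carry both sets
        next_population.add(("+", combined))
        next_population.add(("-", combined))
    return next_population

def cycles_until_combined(population, wanted, max_cycles=10):
    """Fewest cycles after which some cell carries every wanted alteration."""
    wanted = frozenset(wanted)
    for cycle in range(max_cycles + 1):
        if any(wanted <= alts for _, alts in population):
            return cycle
        population = mating_cycle(population)
    return None

# Alterations A and B start in two + strains; C starts in a - strain.
start = {("+", frozenset("A")), ("+", frozenset("B")), ("-", frozenset("C"))}
print(cycles_until_combined(start, {"A", "B"}))
```

In this model A and B cannot meet in the first cycle because both parents are mating type +; each must first pass through progeny of a cross with the - strain, which is why more than one nitrogen deprivation cycle is preferred.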

[122] In another mating embodiment that can be performed as pairwise or multiparental mating, a library of C. 
reinhardtii strains, isolated from geographically diverse regions and containing naturally occurring single nucleotide 
polymorphisms (SNPs), is subjected to mating and screening or selection for a desired phenotype such as hydrogen 
production. The strains are subjected to the above-described mating protocols, with or without mutagenesis of the 
strains before or after mating. In one embodiment, the cells are transformed with an expression vector 
constitutively expressing an iron hydrogenase before they are mated and screened or selected for the ability to 
generate an increased amount of hydrogen. In one embodiment, the strains that are subjected to mating are selected 
from the group of strains comprising (using the strain numbers of the Chlamydomonas Genetics Center, Duke 
University): CC-124, CC-125, CC-1690, CC-1692, CC-407, CC-408, CC-1952, CC-2290, CC-2342, CC-2343, 
CC-2344, CC-2931, CC-2932, CC-2935, CC-2936, CC-2937, CC-2938, 
CC-3059, CC-3060, CC-3061, CC-3062, CC-3063, CC-3064, CC-3065, CC-3067, CC-3068, CC-3071, CC-3073, 
CC-3074, CC-3075, CC-3076, CC-3078, CC-3079, CC-3080, CC-3082, CC-3083, CC-3084, CC-3086, CC-1373 and 
CC-3087. These strains were isolated from geographically diverse regions and contain SNPs relative to each 
other's genome. These strains can also be assayed for phenotypes other than hydrogen production, such as those 
described in the preceding paragraph. 

[123] The multiparental mating can also be between cells other than Chlamydomonas, and the stimulus to induce 
gametogenesis can be other than nitrogen or other nutrient deprivation. For example, the stimulus can be the 
removal of light during exponential growth followed by addition of light in mating reactions with diatoms such as 
T. weissflogii (Armbrust EV, Appl Environ Microbiol. 1999 Jul;65(7):3121-8). Alternatively, the stimulus can be 
addition of a compound or element such as 1 mg/liter Chromium (VI) to cells such as Scenedesmus acutus (Corradi, 
Ecotoxicol Environ Saf. 1995 Oct;32(1):12-8; Corradi, Ecotoxicol Environ Saf. 1995 Mar;30(2):106-10). 

[124] In another embodiment, promoter sequences from a plurality of genes in the genome of an organism are used 
to transform cells, followed by screening or selection for a desired phenotype. For example, a plurality of 500, 
1000, 1500, 2000, or more base pair promoters are amplified from the C. reinhardtii genome. The full genome 
sequence has been completed and can be found at http://genome.igi-psf.org/chirel/chlrel.home.html . The promoter
sequences are connected to a selectable marker sequence and used to transform the nuclear and/or chloroplast
and/or mitochondrial genome. The surviving transformants are screened or selected for a desired phenotype.
Preferably, the transformants are screened for a phenotype related to a metabolic function such as the ability to
produce hydrogen. Optionally, independent transformants of promoter constructs that produce an increased amount
of hydrogen are mated and the progeny are screened for a further increased amount of hydrogen over any of the
parents. The mating can be pairwise or multiparental.
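
As an illustration of how fixed-length promoter regions of 500, 1000, 1500, or 2000 base pairs might be chosen for amplification, the following Python sketch slices the sequence immediately upstream of an annotated gene. The function names, the 0-based half-open coordinate convention, and the toy sequence are assumptions made for illustration only and are not part of the disclosed protocol.

```python
# Hypothetical helpers for picking candidate promoter regions upstream of a gene.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(scaffold, gene_start, gene_end, strand, length):
    """Return up to `length` bases upstream of a gene.

    The gene occupies scaffold[gene_start:gene_end] in forward coordinates
    (assumed convention). For '-' strand genes the upstream region lies past
    gene_end and is returned reverse-complemented, so it reads 5' to 3'
    toward the gene. Regions are truncated at the scaffold ends.
    """
    if strand == "+":
        begin = max(0, gene_start - length)
        return scaffold[begin:gene_start]
    if strand == "-":
        end = min(len(scaffold), gene_end + length)
        return reverse_complement(scaffold[gene_end:end])
    raise ValueError("strand must be '+' or '-'")


def promoter_set(scaffold, gene_start, gene_end, strand,
                 lengths=(500, 1000, 1500, 2000)):
    """Map each requested promoter length to its upstream sequence."""
    return {n: upstream_region(scaffold, gene_start, gene_end, strand, n)
            for n in lengths}


if __name__ == "__main__":
    toy = "AAAACCCCGGGGTTTT"          # toy scaffold, gene at positions 8..11
    print(promoter_set(toy, 8, 12, "+", lengths=(2, 4)))
```

Primers for the actual amplification would still have to be designed against the ends of each returned region.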



Methods of producing hydrogen 

[125] In one embodiment, cells containing mutagenized nucleic acid sequences and capable of producing an 
increased amount of hydrogen are cultured in a culture container with a transparent top section in an outdoor
environment. Cells are grown in minimal culture media containing water, trace amounts of metals, and inorganic
salts. Preferably only photoautotrophic organisms can live in the media. Atmospheric air contacts the top surface
of the culture media. Nucleic acid sequences that are involved in the production of hydrogen are transcribed from
constitutive, light-induced, or dark-induced promoters. Hydrogen evolved from cells is removed from the top of the
culture container. During non-daylight hours, cells, for example, become dormant, metabolize molecules such as
acetate to replenish substrate for digestion and hydrogen production during daylight, or produce hydrogen through a
non-photosynthetic pathway. Optionally, cells are synchronized to the same phase of the cell cycle when producing

EXAMPLE 1

[126] Step 1: Sequence design: Unique sequences a-1 were searched for similarity to known sequences in the
Chlamydomonas genome using the WU-Blast 2.0 program on databases of the Chlamydomonas Genome Project,
located at http://www.biology.duke.edu/chlamy_genome/blast/blast_form.html. The search produced no high
scoring segment pairs. The following databases were searched: Contig Set, EST clones, S1D2 ESTs, Volvocales
(non-EST), and BAC-ends (JGI). Searches were performed using the WU-blastn program using the default matrix
blosum62. Gapped alignments were allowed. The default expected threshold, filter, word length, and cutoff
scores were used. The sum statistics option was used for assessing the significance of aligned pairs. Primer and
chimeric oligonucleotide sequences were designed using sequences from the lhcbl gene promoter (SEQ. ID NO 1),
the 3' untranslated region of the RBCS2 gene (SEQ. ID NO 3), and a selectable marker cassette (SEQ. ID NO 2).
[127] Step 2: Culturing microbes under conditions not conducive and more conducive to the generation of
hydrogen: Chlamydomonas reinhardtii (strain CC-124, Chlamydomonas Genetics Center, Duke University,
Durham, N.C.) is cultured under conditions not conducive to the generation of hydrogen (photoheterotrophically on
Tris-acetate-phosphate medium (TAP), pH 7.2) (Harris, (1989) The Chlamydomonas Sourcebook. Academic Press,
New York; Melis, Plant Physiol (2000) Jan;122(1):127-36). The culture is bubbled with 3% CO2 in air, stirred
gently (at approximately 400 rpm) at 25°C, under continuous illumination (approximately 300 µE m⁻² s⁻¹). The
cells are grown until mid-log phase (approximately 4 x 10⁶ cells mL⁻¹) and then harvested by centrifugation at 2000
x g for 5 minutes. The pellet is divided in half. mRNA is purified from one half of the pellet immediately after
harvesting, as specified below, without freezing. The other half is washed 2 times in TAP-minus-sulfur medium and
resuspended in the same medium to a final concentration of 4-5 x 10⁶ cells mL⁻¹ (Zhang, Planta (2002)
Feb;214(4):552-61; Melis, Plant Physiol (2000) Jan;122(1):127-36). The cells are cultured in containers sealed
from the atmosphere, under illumination (approximately 300 µE m⁻² s⁻¹), and are gently stirred at approximately 400
rpm. The containers allow gas evolved from the algae to escape into the atmosphere but do not allow atmospheric
gas to enter the culture. The cells are cultured under these conditions for approximately 60 hours. The cells are
then harvested by centrifugation at 2000 x g for 5 minutes. RNA is purified immediately after harvesting, without
freezing of the cell pellet.

[128] Step 3: mRNA purification: mRNA is purified from both sets of cells using the Qiagen Oligotex® system
(compositions of buffers OL1, ODB, and OW1 are proprietary; these buffers are purchased directly from Qiagen
Inc., Valencia, CA). DEPC-treated water is used to make all buffers. 2-5 x 10⁷ cells are separated from the pellet
for mRNA purification. The Oligotex® reagent is heated to 37°C in a water bath, vortexed, and set out at room
temperature. 5 mM Tris·Cl pH 7.5 is heated to 70°C. All supernatant is removed from cell pellets. 800 µL of 10
mM Tris·Cl pH 7.5, 140 mM NaCl, 5 mM KCl, 1% Nonidet P-40, 1 mM DTT (optionally with RNase
inhibitors added), chilled at 4°C, is added and the pellet is resuspended. The suspension is incubated on ice for 5
minutes. The suspension is pelleted in a microcentrifuge tube for 2 minutes at 300-500 x g at 4°C. The
supernatant is transferred to a new tube. 800 µL of room temperature 1 M LiCl, 20 mM Tris·Cl pH 7.5, 2 mM
EDTA, 1% SDS and 145 µL of the Oligotex® suspension are added to the supernatant, which is then vortexed. The
resulting mixture is then incubated at 70°C for 3 minutes and then at 20-30°C for 10 minutes. The mixture is
pelleted in a microcentrifuge at 14,000-18,000 x g for 5 minutes. The supernatant is removed. The pellet is
resuspended in 200 µL of Qiagen buffer OL1 (containing 14.3 µL β-mercaptoethanol per mL of OL1). 800 µL of
Qiagen buffer ODB is added and the suspension is incubated at 70°C for 3 minutes and room temperature for 10
minutes. The suspension is then pelleted in a microcentrifuge at maximum speed for 5 minutes. The supernatant is
removed. The pellet is then resuspended in 600 µL of Qiagen buffer OW1. The suspension is then pipetted onto a
large Qiagen Oligotex spin column placed inside a 2 mL microcentrifuge tube and is centrifuged for 1 minute at
maximum speed. The spin column is then placed in an RNase-free 2 mL microcentrifuge tube. 600 µL of 10 mM
Tris·Cl pH 7.5, 1 mM EDTA, 150 mM NaCl is added to the spin column, which is then centrifuged for 1 minute at
maximum speed. The flow through is discarded and 600 µL of 10 mM Tris·Cl pH 7.5, 1 mM EDTA, 150 mM
NaCl is added to the spin column, which is then centrifuged again for 1 minute at maximum speed. The spin
column is then placed in a new RNase-free 2 mL microcentrifuge tube. Approximately 200 µL of 70°C 5 mM
Tris·Cl pH 7.5 is added to the spin column. The resin is resuspended by pipetting the buffer:resin mix several
times. The spin column is then centrifuged for 1 minute at maximum speed. The flow through is pipetted to a new
RNase-free tube. The elution process is repeated with another 200 µL of 70°C 5 mM Tris·Cl pH 7.5 and the flow
through is added to the first flow through. The concentration and purity of the RNA are analyzed using
spectrophotometric analysis.

[129] Step 4: cDNA synthesis and in vitro transcription: Double stranded, labeled cDNA is synthesized from the
purified mRNA samples using the Invitrogen Life Technologies Superscript® Choice system (Invitrogen Inc.,
Carlsbad, CA). mRNA samples from cells cultured under conditions not conducive to the generation of hydrogen
and from cells cultured under conditions more conducive to the generation of hydrogen are processed
simultaneously. 4 µg of mRNA from each sample are put into RNase-free microcentrifuge tubes, along with 100
pmol of HPLC-purified primer of the sequence
5'-GGCCAGTGAATTGTAATACGACTCACTATAGGGAGGCGG-(dT)24-3'. The tube is incubated at 70°C for
10 minutes, briefly centrifuged, and placed on ice for 5 minutes. The following reagents are added: (1) 1 µL of 10
mM dNTP mix; (2) 2 µL of 100 mM DTT; (3) 4 µL of 5X first strand cDNA buffer (proprietary composition, available
from Invitrogen Inc., Carlsbad, CA). The reaction is then incubated at 37°C for 2 minutes. 4 µL of 200 U/µL
Superscript® II reverse transcriptase is added to the reaction to make a final volume of 20 µL. The reaction is then
incubated at 37°C for 1 hour. The reaction is then placed on ice and the following reagents are added and mixed: 91
µL of DEPC-treated water, 30 µL of 5X second strand reaction buffer (proprietary composition, available from
Invitrogen Inc., Carlsbad, CA), 3 µL of 10 mM dNTP mix, 1 µL of 10 U/µL E. coli DNA ligase, 4 µL of 10 U/µL E.
coli DNA polymerase I, and 1 µL of 2 U/µL E. coli RNase H. The reaction is incubated at 16°C for 2 hours. 2 µL
of 5 U/µL T4 DNA Polymerase is added to the reaction and it is incubated for 5 minutes at 16°C. 10 µL of 0.5 M
EDTA is added to the reaction.

[130] The reaction is put through a phenol:chloroform extraction using a Phase-Lock gel (optionally the reaction is
put through a standard phenol:chloroform extraction). The Phase-Lock gel is pelleted in a 1.5 mL microcentrifuge
tube at 12,000 x g for 30 seconds. 162 µL of 25:24:1 phenol:chloroform:isoamyl alcohol (saturated with 10 mM
Tris-HCl pH 8.0, 1 mM EDTA) is added to the 162 µL reaction for a total of 324 µL. The mixture is briefly vortexed,
and the entire 324 µL is then added to the Phase-Lock gel tube. The tube is centrifuged at >12,000 x g for 2
minutes. The upper aqueous layer containing the cDNAs is transferred to a new 1.5 mL tube. 0.5 volumes of 7.5 M
NH4OAc and 2.5 volumes of 100% ethanol are added to the cDNAs. The tube is vortexed and then centrifuged at
>12,000 x g for 20 minutes. The supernatant is removed and the pellet is washed with 500 µL of 80% ethanol. The
tube is then centrifuged at >12,000 x g for 5 minutes. The wash is repeated once. The pellet is then air dried and
resuspended in 12 µL of RNase-free water. The cDNA sample from cells cultured under conditions conducive to the
generation of hydrogen is labeled as the "conducive C. rein sample." The cDNA sample from cells cultured under
conditions not conducive to the generation of hydrogen is labeled as the "nonconducive C. rein sample." The
cDNA samples are put through in vitro transcription reactions and are biotin labeled using the Enzo® BioArray®
High Yield RNA Labeling Kit (available as part No. 900182 from Affymetrix Inc., Santa Clara, CA).
[131] Step 5: Labeled in vitro transcript purification: Total amounts of RNA generated from the in vitro
transcription reactions are determined by spectrophotometry and/or gel electrophoresis. Biotin-labeled RNA
samples that originated from cells cultured under conditions not conducive to the generation of hydrogen and
biotin-labeled RNA samples that originated from cells cultured under conditions more conducive to the generation
of hydrogen are processed simultaneously. 600-800 µg of biotin-labeled RNA are purified on Qiagen RNeasy®
midi columns. All centrifugations and reactions are performed at room temperature. For smaller or larger amounts
of biotin-labeled RNA, mini or maxi columns are used, respectively, along with modified protocols according to the
manufacturer. The labeled RNA is added to a tube and is brought up to a volume of 1 mL with RNase-free water.
4 mL of buffer RLT is added (compositions of buffers RLT, RW1, and RPE are proprietary; these buffers are
purchased directly from Qiagen Inc., Valencia, CA) and the sample is mixed. 2.8 mL of 100% ethanol is added and
the sample is mixed. The sample is immediately applied to a Qiagen RNeasy® midi column, which is placed in a 50 mL tube,
and centrifuged 5 minutes at 3,000-5,000 x g. The flow through is discarded. 2.5 mL of buffer RPE is added to the
column, which is then centrifuged 2 minutes at 3,000-5,000 x g. The flow through is discarded. 2.5 mL of buffer
RPE is again added to the column, which is then centrifuged 5 minutes at 3,000-5,000 x g. The column is placed in
a new 15 mL RNase-free tube. 250 µL of RNase-free water is added to the column. The column is allowed to sit
for 1 minute and is then centrifuged 3 minutes at 3,000-5,000 x g. Another 250 µL of RNase-free water is added to
the column. The column is allowed to sit for 1 minute and is then centrifuged 3 minutes at 3,000-5,000 x g. The
concentration of the eluted biotin-labeled RNA is measured spectrophotometrically. If the concentration is less than
0.6 µg/µL, the biotin-labeled RNA is precipitated by adding 0.5 volumes of 7.5 M NH4OAc and 2.5 volumes of 100%
ethanol so that it can be resuspended in a smaller volume of RNase-free water. The tube is vortexed and then placed at -20°C
for at least 1 hour. The tube is centrifuged at >12,000 x g at 4°C for 30 minutes. The pellet is washed twice with
500 µL of -20°C 80% ethanol. The pellet is air dried and resuspended in 10 µL of RNase-free water. The
concentration of biotin-labeled RNA is adjusted to 2 µg/µL.

[132] Step 6: Labeled in vitro transcript fragmentation: 12 µL of 2 µg/µL biotin-labeled RNA is added to an
RNase-free tube along with 3 µL of 5X fragmentation buffer (200 mM Tris-acetate pH 8.1, 500 mM KOAc, 150
mM MgOAc). The tube is placed at 94°C for 35 minutes and then placed on ice. The biotin-labeled RNA is
fragmented into sizes from approximately 35-200 nucleotides, and this is confirmed by gel electrophoresis using
appropriate size markers.
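
The volumes given in Step 6 imply a 15 µL reaction in which the 5X fragmentation buffer is diluted to 1X. The short Python sketch below works through that arithmetic; the helper name and variable names are illustrative assumptions, and only the volumes and stock concentrations are taken from the text.

```python
# Worked dilution arithmetic for the fragmentation reaction (illustrative sketch).

def diluted_concentration(stock, stock_volume_uL, total_volume_uL):
    """C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock * stock_volume_uL / total_volume_uL


rna_volume_uL = 12.0        # biotin-labeled RNA added
rna_stock_ug_per_uL = 2.0   # RNA concentration before mixing
buffer_volume_uL = 3.0      # 5X fragmentation buffer added
total_volume_uL = rna_volume_uL + buffer_volume_uL

# Components of the 5X buffer, in mM, as listed in Step 6.
buffer_5x_mM = {"Tris-acetate": 200.0, "KOAc": 500.0, "MgOAc": 150.0}

final_buffer_mM = {
    name: diluted_concentration(conc, buffer_volume_uL, total_volume_uL)
    for name, conc in buffer_5x_mM.items()
}
rna_mass_ug = rna_volume_uL * rna_stock_ug_per_uL
rna_final_ug_per_uL = rna_mass_ug / total_volume_uL

if __name__ == "__main__":
    print(total_volume_uL, final_buffer_mM, rna_mass_ug, rna_final_ug_per_uL)
```

Under these numbers the reaction contains 24 µg of RNA at 1.6 µg/µL in 40 mM Tris-acetate, 100 mM KOAc, and 30 mM MgOAc.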

[133] Step 7: Microarray hybridization and differential expression profile creation: Microarray chips containing
2,761 unique C. reinhardtii sequences are obtained from the Chlamydomonas Genome Project (Duke University,
Durham, N.C., http://www.biology.duke.edu/chlamy_genome/microarrays.html). Sequence IDs and grid
locations for clones are obtained from the same source (at
ftp://ftp.biology.duke.edu/pub/chlamy_genome/sequences/ ). Fragmented biotin-labeled RNA samples are
hybridized to C. reinhardtii microarrays according to Affymetrix GeneChip Expression Analysis protocols
(Affymetrix Inc., Santa Clara, CA). Microarrays hybridized with labeled nonconducive RNA samples and
microarrays hybridized with labeled conducive RNA samples are compared and analyzed for identification of
differentially regulated genes. The microarray data set containing the expression data from cells cultured under
conditions not conducive to the generation of hydrogen and cells cultured under conditions more conducive to the
generation of hydrogen is a differential expression profile.

[134] Step 8: Creation of probes corresponding to differentially regulated genes: Genes that exhibit greater than a
1.5-fold difference in expression between cells cultured under conditions not conducive to the generation of
hydrogen and cells cultured under conditions more conducive to the generation of hydrogen are identified as
differentially regulated genes. The 5 genes (referred to hereinafter as the 1 H2, 2 H2, 3 H2, 4 H2, and 5 H2 genes, and
collectively as the 1-5 H2 set) that are not expressed in cells cultured under conditions not conducive to the
generation of hydrogen and are upregulated most compared to other upregulated genes when cells are switched
from conditions not conducive to the generation of hydrogen to conditions more conducive to the generation of
hydrogen are selected for mutagenesis. Alternatively, the iron-hydrogenase gene is designated as one of the 5 genes,
regardless of its expression level relative to other genes. PCR primers are designed corresponding to a 50-200 base
pair segment of each gene of the 1-5 H2 set, wherein the segment chosen does not contain a specific restriction
enzyme site corresponding to restriction enzymes that leave 5' overhangs at cut sites. For example, the restriction
enzymes BamHI, Hind III, and Bgl II leave 5' overhangs after cutting double stranded DNA. The PCR primers
contain the restriction enzyme sequence chosen at their 5' end. The primers are used to amplify their corresponding
fragment from each gene of the 1-5 H2 set using the conducive C. rein cDNA sample as a template. PCR products
are digested with the restriction enzyme corresponding to the ends of amplified fragments. The PCR products are
purified from the digested ends using agarose gel electrophoresis and electroelution from the gel fragment. The
electroeluted PCR products, referred to hereinafter as the 1-5 H2 set probes, are precipitated from the electroelution
buffer with 0.5 volumes of 7.5 M NH4OAc and 2 volumes of -20°C 100% ethanol. The 1-5 H2 set probes are
pelleted at 14,000 x g. The pellets are washed two times with -20°C 70% ethanol. The pellets are dried and
resuspended in water.
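
The gene selection criteria of Step 8 can be sketched in Python as follows. This is an illustrative sketch only: the function names, the pseudocount used to keep fold differences finite when a gene is undetected, the detection floor that stands in for "not expressed", and the toy signal values are all assumptions and are not taken from the disclosed protocol.

```python
# Sketch of the Step 8 selection logic on a differential expression profile.
# expression maps gene name -> (nonconducive signal, conducive signal).

def fold_difference(signal_a, signal_b, pseudocount=1.0):
    """Ratio of the larger to the smaller signal (pseudocount is an assumption)."""
    high, low = max(signal_a, signal_b), min(signal_a, signal_b)
    return (high + pseudocount) / (low + pseudocount)


def differentially_regulated(expression, cutoff=1.5):
    """Genes whose expression differs by more than `cutoff`-fold."""
    return sorted(gene for gene, (non, con) in expression.items()
                  if fold_difference(non, con) > cutoff)


def select_for_mutagenesis(expression, detection_floor=0.0, n_genes=5,
                           cutoff=1.5, forced=()):
    """Pick genes undetected under nonconducive conditions and most induced.

    `forced` lists genes included regardless of ranking, mirroring the
    alternative in which the iron-hydrogenase gene is one of the 5 genes.
    """
    chosen = [gene for gene in forced if gene in expression][:n_genes]
    candidates = [
        (con, gene) for gene, (non, con) in expression.items()
        if gene not in chosen and non <= detection_floor and con > non
        and fold_difference(non, con) > cutoff
    ]
    candidates.sort(reverse=True)   # strongest conducive signal first
    chosen += [gene for _, gene in candidates[: n_genes - len(chosen)]]
    return chosen


if __name__ == "__main__":
    toy = {"g1": (0, 900), "g2": (0, 50), "g3": (400, 1200), "g4": (0, 300),
           "g5": (100, 110), "g6": (0, 700), "g7": (0, 20), "g8": (0, 500)}
    print(differentially_regulated(toy))
    print(select_for_mutagenesis(toy))
    print(select_for_mutagenesis(toy, forced=("g3",)))
```

Ranking undetected genes by their conducive signal is one way to read "upregulated most", since a true fold change is undefined when the baseline signal is zero.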

[135] Step 9: Culturing microbes capable of producing hydrogen and creation of cDNA libraries: The following
species of Chlamydomonas are cultured under conditions more conducive to the generation of hydrogen (available 
from the UTEX collection at The University of Texas at Austin, Austin, TX): (1) Chlamydomonas pulvinata 
(UTEX strain number 212, isolated from Switzerland); (2) Chlamydomonas pygmaea (UTEX strain number 2539, 
isolated from Prudhoe Bay, Alaska); (3) Chlamydomonas radiata (UTEX strain number 966, isolated from 
McMahan, Texas); (4) Chlamydomonas rapa (UTEX strain number 1342, isolated from Danube River, Bratislava, 
Czechoslovakia); (5) Chlamydomonas sajao (UTEX strain number 2277, isolated from Sa Jiao, China); (6) 
Chlamydomonas segnis 222 (UTEX strain number 222, isolated from West Humble, Surrey, England); (7) 
Chlamydomonas segnis 1638 (UTEX strain number 1638, isolated from Dauphin Is., Alabama, U.S.A.); (8) 
Chlamydomonas segnis 1919 (UTEX strain number 1919, isolated from Delta Marsh, Manitoba, Canada); (9)
Chlamydomonas smithii (UTEX strain number 1061, isolated from Santa Cruz, California, U.S.A.); (10) 
Chlamydomonas sphaeroides (UTEX strain number 221, isolated from India); (11) Chlamydomonas surtseyiensis 
(UTEX strain number 1796, isolated from Surtsey, Iceland); (12) Chlamydomonas ulvaensis (UTEX strain number 
724, isolated from Ulva Island, Scotland); (13) Chlamydomonas zimbabwiensis (UTEX strain number 2213, 
isolated from Zimbabwe); (14) Chlamydomonas reinhardtii (strain CC-124, Chlamydomonas Genetics Center, Duke
University, Durham, N.C.). The species are cultured in TAP-minus-sulfur medium. The cells are cultured in
containers sealed from the atmosphere, under illumination (approximately 300 µE m⁻² s⁻¹), and are gently stirred at
approximately 400 rpm. The containers allow gas evolved from the algae to escape into the atmosphere but do not
allow atmospheric gas to enter the culture. The cells are cultured under these conditions for approximately 60
hours. The cells are then harvested by centrifugation at 2000 x g for 5 minutes. mRNA is purified immediately
after harvesting, without freezing of the cell pellets. mRNA is purified from each Chlamydomonas strain as
previously described using the Qiagen Oligotex® system.

[136] cDNA libraries are made from each Chlamydomonas mRNA sample. Double stranded cDNA is synthesized
from the purified mRNA samples using the Invitrogen Life Technologies Superscript® Choice system. mRNA
samples from each Chlamydomonas strain are processed in parallel. 4 µL of 1 µg/µL mRNA in DEPC-treated
water is added to an RNase-free centrifuge tube. 2 µL of 0.5 µg/µL oligo(dT)12-18 primer and 2 µL of 50 ng/µL
random hexamer primers are added to the mRNA. The sample is heated at 70°C for 10 minutes and immediately
transferred to ice. The sample is briefly centrifuged and the following components are added: (1) 4 µL of 250 mM
Tris·HCl pH 8.3, 375 mM KCl, 15 mM MgCl2; (2) 2 µL of 100 mM DTT; (3) 1 µL of 10 mM dNTPs; (4) 1 µL of 1
µCi/µL [α-³²P]dCTP. The reaction is mixed and incubated at 37°C for 2 minutes. 4 µL of 200 U/µL
Superscript® Reverse Transcriptase II is added to the reaction, which is mixed and incubated at 37°C for one hour
and then placed on ice. 18 µL of the reaction is placed into a new tube. The following reagents are also added: (1)
93 µL of DEPC-treated water; (2) 30 µL of 100 mM Tris·HCl pH 6.9, 450 mM KCl, 23 mM MgCl2, 0.75 mM
β-NAD+, 50 mM (NH4)2SO4; (3) 3 µL of 10 mM dNTPs; (4) 1 µL of 10 U/µL E. coli DNA ligase; (5) 4 µL of 10 U/µL
E. coli DNA Polymerase I; (6) 1 µL of 2 U/µL E. coli RNase H. The reaction is briefly vortexed, briefly
centrifuged, and incubated for 2 hours at 16°C. 2 µL of 5 U/µL T4 DNA Polymerase is added and the reaction is
incubated 5 minutes at 16°C. The reaction is then placed on ice and 10 µL of 0.5 M EDTA is added. 150 µL of
25:24:1 phenol:chloroform:isoamyl alcohol is added to the reaction, which is then vortexed and centrifuged at room
temperature for 5 minutes at 14,000 x g. 140 µL of the upper aqueous phase is transferred to a new microcentrifuge
tube. 70 µL of 7.5 M NH4OAc and 500 µL of -20°C 100% ethanol are added to the sample. The tube is vortexed
and centrifuged at room temperature for 5 minutes at 14,000 x g. The supernatant is removed and the pellet is
washed with 500 µL of -20°C 70% ethanol. The tube is centrifuged at room temperature for 2 minutes at 14,000 x
g and the supernatant is discarded. The pellet is dried at 37°C for 10 minutes. The pellet is resuspended in: (1) 18
µL of DEPC-treated water; (2) 10 µL of 330 mM Tris·HCl pH 7.6, 50 mM MgCl2, 5 mM ATP; (3) 10 µL of 1
µg/µL EcoRI (Not I) adapters; (4) 7 µL of 100 mM DTT; (5) 5 µL of 1 U/µL T4 DNA ligase. The reaction is
mixed and incubated for 24 hours at 16°C. The reaction is then incubated at 70°C for 10 minutes and then placed on
ice. 3 µL of 10 U/µL T4 Polynucleotide Kinase is added to the sample, which is mixed and then incubated for 0.5
hours at 37°C. The reaction is then incubated for 10 minutes at 70°C and placed on ice. For each sample, a 1 mL
pre-packed Sephacryl S-500 HR column is drained of 20% ethanol. 800 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM
EDTA, 25 mM NaCl is pipetted onto the top of each column. The column is allowed to drain. The wash is
performed 3 more times with the same volume. 97 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25 mM NaCl is
added to each reaction and mixed. The reaction is added to the top of the column and drained into a first
microcentrifuge tube. 100 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25 mM NaCl is added to the top of the
column and drained into a second microcentrifuge tube. 100 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25
mM NaCl is added to the top of the column and each drop flowing from the bottom of the column is collected into a
new tube. The process is continued with 100 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25 mM NaCl being
added to the top of the column until 18 drops are collected in 18 successive tubes numbered 3-20. The volume in
all 20 tubes is measured. The volumes are summed cumulatively to determine the portion of the column flow
through contained in each tube. Tubes containing volume collected after 600 µL of eluate has flowed through the column are
discarded. The remaining tubes are placed in a scintillation counter and Cerenkov counts for each tube are
measured. Tubes containing only background Cerenkov counts are discarded. The concentration of cDNA in each
remaining fraction is determined according to the Superscript® Choice System for cDNA Synthesis manufacturer's
recommendations (Invitrogen Inc., Carlsbad, CA, Catalog Series 18090). Fractions containing more than 0.1 ng/µL
cDNA are pooled. The cDNAs are precipitated with 0.5 volumes of 7.5 M NH4OAc and 2 volumes of -20°C 100%
ethanol. The sample is vortexed and centrifuged at room temperature for 20 minutes at 14,000 x g. The pellet is
washed two times with 500 µL of -20°C 70% ethanol and then dried at 37°C for 10 minutes. The pellet is
resuspended in 20 µL of 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25 mM NaCl. A dilution of each Chlamydomonas
cDNA is made to yield 10 µL of 1 ng/µL cDNA in 10 mM Tris·HCl pH 7.5, 0.1 mM EDTA, 25 mM NaCl. All
Chlamydomonas cDNA samples are processed in parallel. To each cDNA tube, the following reagents are added:
(1) 4 µL of 250 mM Tris·HCl pH 7.6, 50 mM MgCl2, 5 mM ATP, 5 mM DTT, 25% (w/v) polyethylene glycol
8000; (2) 5 µL of 10 ng/µL EcoRI-cut, dephosphorylated plasmid pcDNA3(+) (available from Invitrogen Inc.,
Carlsbad, CA); (3) 1 µL of 1 U/µL T4 DNA ligase. The reaction, hereinafter referred to for each strain as the "X
strain conducive cDNA library" (such as the Chlamydomonas surtseyiensis conducive cDNA library), is incubated
3 hours at room temperature and then frozen at -20°C.
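
The rules for choosing which column fractions to pool can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name, the record layout, the background count threshold, and the toy values are hypothetical, and a tube is treated as "collected after 600 µL" once the cumulative volume including that tube exceeds 600 µL. The cDNA concentration of each fraction is taken as an input, since the text defers that calculation to the manufacturer's recommendations.

```python
# Sketch of the fraction-pooling rules for the size-fractionation column.

def fractions_to_pool(fractions, volume_limit_uL=600.0,
                      background_counts=30.0, min_ng_per_uL=0.1):
    """Return tube numbers to pool, given fractions in elution order.

    Each fraction is a dict with keys 'tube', 'volume_uL', 'counts'
    (Cerenkov counts) and 'ng_per_uL' (cDNA concentration).
    """
    pooled = []
    cumulative_uL = 0.0
    for fraction in fractions:
        cumulative_uL += fraction["volume_uL"]
        if cumulative_uL > volume_limit_uL:
            break       # this tube and all later tubes are discarded
        if fraction["counts"] <= background_counts:
            continue    # only background counts: discard
        if fraction["ng_per_uL"] > min_ng_per_uL:
            pooled.append(fraction["tube"])
    return pooled


if __name__ == "__main__":
    toy = [
        {"tube": 1, "volume_uL": 100, "counts": 20, "ng_per_uL": 0.0},
        {"tube": 2, "volume_uL": 100, "counts": 25, "ng_per_uL": 0.0},
        {"tube": 3, "volume_uL": 35, "counts": 500, "ng_per_uL": 0.5},
        {"tube": 4, "volume_uL": 35, "counts": 300, "ng_per_uL": 0.05},
        {"tube": 5, "volume_uL": 35, "counts": 400, "ng_per_uL": 0.2},
        {"tube": 6, "volume_uL": 400, "counts": 1000, "ng_per_uL": 1.0},
    ]
    print(fractions_to_pool(toy))
```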

[137] Step 10: Cloning of 1-5 H2 set cDNAs: The 1-5 H2 set probes are labeled with [α-³²P]dNTPs using the
Klenow DNA Polymerase fragment (available from New England Biolabs Inc., Beverly, MA) according to standard
protocols. The conducive cDNA libraries from the fourteen Chlamydomonas strains grown in step 9 are used to
transform competent E. coli cells using standard protocols. The plated E. coli cells transformed with each of the
fourteen conducive cDNA libraries are used for cloning cDNAs for each of the 1-5 H2 set gene homologues from
each of the fourteen conducive cDNA libraries using standard cDNA cloning methods (Maniatis et al. (1989)
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory). The probes used to identify each of
the 1-5 H2 set gene homologues are the 1-5 H2 set probes. The identified clones are sequenced. Full length cDNAs
are obtained using RACE-PCR with mRNA samples from each Chlamydomonas strain as template. A full length
cDNA from each of the 1-5 H2 set gene homologues is selected for use in DNA shuffling and is referred to as the X
strain Y H2 gene (such as the Chlamydomonas pygmaea 3 H2 gene). A total of 70 cDNA sequences are obtained (a
1 H2, 2 H2, 3 H2, 4 H2, and 5 H2 gene from each of the 14 Chlamydomonas strains).

[138] Step 11: Creation of nonshuffled DNA construct segments: Nonshuffled segments I-VIII are generated
through PCR amplification using primers and templates listed in Table 1. The position of these primers relative to
the sequence information they contain (not drawn to scale) is depicted in Figure 6 by arrows. Nonshuffled
segments I-VIII are gel purified, electroeluted, and precipitated. The fragments are resuspended in water.

[139] Step 12: Shuffling of 1-5 H2 set coding regions: The coding region of each of the 70 1-5 H2 set homologue
genes is amplified using the cDNA plasmid as template and primers corresponding to the N-terminal portion and the
complement of the C-terminal portion of the cDNA coding sequences. PCR products corresponding to the coding regions of all 1-5
H2 set homologue genes are gel-purified, electroeluted, precipitated, and resuspended in 50 mM Tris·HCl pH 7.4, 1
mM MgCl2. Alternatively, PCR primers are removed from the reaction using the Wizard® PCR product (Promega
Corp., Madison, WI) and the PCR products are resuspended in 50 mM Tris·HCl pH 7.4, 1 mM MgCl2. Chimeric
oligonucleotides are synthesized according to Table 2 and are resuspended in 50 mM Tris·HCl pH 7.4, 1 mM
MgCl2.

[140] 70 PCR products corresponding to the coding regions of all 1-5 H 2 set homologue genes are quantified with 
40 spectrophotometry. Reactions for each of the 1-5 H 2 genes are performed in parallel. Equal molar amounts of each 
cDNA corresponding to each of the 1-5 H 2 set homologue genes are pooled in separate tubes to obtain a total of 4 
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ug DNA in 100 \iL 50 mM Tris^HCl pH 7.4, 1 mM MgCl 2 . In other words, 0.2857 \ig of cDNA from each of the 
14 cDNAs corresponding to the 1 H 2 gene are added to a single tube. 0.2857 \ig of cDNA from each of the 14 
cDNAs corresponding to the 2 H 2 gene are added to a different tube, and so on, such that each H 2 gene is shuffled in 
a separate reaction. DNAse I (obtained from Sigma Corp., St. Louis, MO) is added to each tube at a concentration 
5 of 0.0015 units of Dnase I per \xl of DNA. The digestion reaction proceeds for 15 minutes at room temperature and 
is stopped. Digestion products from approximately 20-150 base pairs are purified from 2% low melting agarose 
gels, electroeluted, and precipitated. An equivalent molar amount of corresponding chimeric oligonucleotides to the 
original starting material for each cDNA is added to each tube. For instance, a 900 base pair 1 H 2 cDNA from one 
of the 14 strains corresponds to 0.481 pmol (1/14 of 4 u-g added to DNAse I digestion reaction converted to pmol 

10 for a 900 base pair double stranded fragment). For 1 H 2 cDNAs of approximately 900 base pairs, 0.481 pmol of 
chimeric oligonucleotides 1.1-1.14 and 0.481 pmol of chimeric oligonucleotides 2.1-2.14 are added to the purified 
fragmented coding regions. Chimeric oligonucleotides 3.1-3.14 and 4.1-4.14 are added to 2 H2 fragments.
Chimeric oligonucleotides 5.1-5.14 and 6.1-6.14 are added to 3 H2 fragments. Chimeric oligonucleotides 7.1-7.14
and 8.1-8.14 are added to 4 H2 fragments. Chimeric oligonucleotides 9.1-9.14 and 10.1-10.14 are added to 5 H2
fragments. Chimeric oligonucleotides and 20-150 base pair cDNA fragments are resuspended in 0.2 mM of each
dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, to a volume of 100 µl where the
DNA concentration is approximately 20 ng/µl. 1.25 units of Taq polymerase and 1.25 units of Pfu polymerase are
added. Each of the 5 tubes corresponding to cDNA fragments and chimeric oligonucleotides for genes 1-5 H2 is
subjected to a thermocycling program of 94°C for 60 seconds one time, followed by 40 cycles of 94°C for 30
seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a one-time incubation at 72°C for 5 minutes.
10 µl from each reaction is brought up to 100 µl in new PCR tubes in 0.2 mM of each dNTP, 2.2 mM MgCl2, 50
mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 8 µM of primers corresponding to unique sequences and the
complements of unique sequences at the ends of each cDNA fragment, and 1.25 units of Taq polymerase and 1.25
units of Pfu polymerase. Shuffled 1 H2 genes are amplified by primers corresponding to unique sequence a and the
complement of unique sequence b. Shuffled 2 H2 genes are amplified by primers corresponding to unique sequence
c and the complement of unique sequence d. Shuffled 3 H2 genes are amplified by primers corresponding to unique
sequence e and the complement of unique sequence f. Shuffled 4 H2 genes are amplified by primers corresponding
to unique sequence g and the complement of unique sequence h. Shuffled 5 H2 genes are amplified by primers
corresponding to unique sequence i and the complement of unique sequence j. The amplification reactions are
performed in a thermocycler with a program of 94°C for 60 seconds one time, followed by 20 cycles of 94°C for 30
seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a one-time incubation at 72°C for 5 minutes.
PCR products, now referred to as the 1 H2 shuffled library, the 2 H2 shuffled library, etc., are gel purified,
electroeluted, precipitated, and resuspended in water.
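The two thermocycling programs above differ only in cycle number, so their hold times can be compared with a short calculation. The following is a minimal sketch; the function name is illustrative, and ramp times between temperatures (which depend on the instrument) are deliberately ignored.

```python
def program_seconds(cycles, initial=60, steps=(30, 30, 30), final=300):
    """Total hold time of a program: one initial denaturation at 94 C,
    `cycles` rounds of the three 30-second steps (94/55/72 C), and one
    final 5-minute incubation at 72 C. Ramp time is not included."""
    return initial + cycles * sum(steps) + final


reassembly = program_seconds(40)      # primerless reassembly program
amplification = program_seconds(20)   # amplification with unique-sequence primers
print(reassembly / 60, "min;", amplification / 60, "min")
```

Under these assumptions the 40-cycle program holds for 66 minutes and the 20-cycle program for 36 minutes.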

[141] Step 13: Synthesis of test constructs: Equimolar amounts of nonshuffled segments I-VIII and 1-5 H2 shuffled
libraries are added together in a new primerless PCR reaction. 1 pmol each of nonshuffled segment I, nonshuffled
segment II, nonshuffled segment III, nonshuffled segment IV, nonshuffled segment V, nonshuffled segment VI,
nonshuffled segment VII, nonshuffled segment VIII, 1 H2 shuffled library, 2 H2 shuffled library, 3 H2 shuffled
library, 4 H2 shuffled library, and 5 H2 shuffled library are brought up to a volume of 100 µl in 0.2 mM of each
dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, with 2.5 units of Pfu DNA
polymerase. The reaction is subjected to a thermocycling program of 94°C for 60 seconds one time, followed by 40
cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a one-time incubation at
72°C for 5 minutes. Double stranded primerless PCR products, now referred to as 1-5 H2 test constructs, are
separated from oligonucleotides and fragments by gel electrophoresis and products of the expected size are
electroeluted, precipitated, and resuspended in sterile water.
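Adding 1 pmol of each segment requires converting a molar amount into a mass that can be measured by spectrophotometry. The sketch below assumes an average mass of 660 g/mol per base pair of double stranded DNA (a common approximation, not a value from this disclosure); the fragment lengths are hypothetical placeholders.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (assumed average)


def ng_for_pmol(length_bp, pmol=1.0):
    """Mass in nanograms of `pmol` picomoles of a dsDNA fragment."""
    picograms = pmol * length_bp * AVG_BP_MASS  # pmol x g/mol = pg
    return picograms / 1000.0


# Hypothetical fragment lengths, for illustration only.
fragments_bp = {"nonshuffled segment I": 500, "1 H2 shuffled library": 1500}
for name, bp in fragments_bp.items():
    print(f"{name}: {ng_for_pmol(bp):.0f} ng per pmol")
```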

[142] Step 14: Transformation of cells with mutagenized nucleic acid sequences: The Chlamydomonas reinhardtii
strain CC-400 (a cell wall deficient strain, Chlamydomonas Genetics Center, Duke University) is grown with
shaking in TAP media (Harris, (1989) The Chlamydomonas Sourcebook. Academic Press, New York; Gorman,
Proc Natl Acad Sci U S A (1965) Dec;54(6):1665-9) until the cells reach a density of approximately 2 x 10⁶
cells/ml. The cells are pelleted at 4000 x g for 5 minutes and the supernatant is removed. The cell pellet is
resuspended in 7.5 ml of TAP medium per liter of original culture. The following components are added, in order,
to 25 sterile tubes: 300 µl of cells, 1 µg of 1-5 H2 test construct, 100 µl of sterile-filtered 20% PEG, 300 mg of
sterile glass beads (prepared according to Kindle, Meth Enzymology (1998) 297: 27-38). Each tube is vortexed 15-
30 seconds at high speed. The cells are removed from the tube and spread onto plates containing phleomycin
(Stevens, Mol Gen Genet (1996) Apr 24;251(1):23-30). Plates are incubated in low light (approximately 5 µE m⁻²
s⁻¹) at 25°C for 4-6 days in atmospheric air until colonies appear.

[143] Step 15: Screening for increased amounts of hydrogen: Phleomycin resistant colonies are transferred to new
plates containing identical culture media. Colonies are plated in 96-colony grids. Replica plates are also made and
stored at 15°C in low light. The 96-colony plates, made of clear plastic, are incubated in low light (approximately 5
µE m⁻² s⁻¹) at 25°C in atmospheric air until colonies are approximately 3 mm in diameter. Chlamydomonas
reinhardtii strain CC-400 is used as a control on each 96-colony plate. After colonies have grown to the desired
size, 3 mm thick filter paper is placed over the plate, covering the colonies. A chemochromic film containing
tungsten trioxide is placed on top of the filter paper (Seibert). A rectangular clear plastic grid design is placed
directly over the chemochromic film such that the center of each square on the grid is directly over the center of a
cell colony. The plates are incubated in light (approximately 55 µE m⁻² s⁻¹) at 25°C in 5% oxygen for 12 hours.
The plates are illuminated from above and below. After 12 hours, each plate is photographed from the top using a
digital camera within 5 seconds of removal from the incubation chamber. The images are scanned by densitometry
and are subsequently screened for dark spots on the chemochromic film that indicate the production of hydrogen.
Spots that are quantitatively darker than spots directly over control colonies of nontransformed Chlamydomonas
reinhardtii strain CC-400 indicate cells that generate an increased amount of hydrogen. These colonies are
recovered from the test plates or the replica plates.
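The comparison of test spots against control spots can be sketched as a simple threshold on densitometry readings. This is an illustration only: it assumes each plate carries at least two control positions and that "quantitatively darker" is taken to mean more than two standard deviations above the control mean. Neither assumption is specified in the protocol, and the well names and readings are made up.

```python
from statistics import mean, stdev


def darker_than_control(spots, control_wells, n_sd=2.0):
    """Return grid positions whose spot density exceeds the control mean
    by more than `n_sd` control standard deviations (assumed criterion)."""
    controls = [spots[w] for w in control_wells]
    threshold = mean(controls) + n_sd * stdev(controls)
    return sorted(w for w, d in spots.items()
                  if w not in control_wells and d > threshold)


# Hypothetical optical densities read from one plate image.
plate = {"A1": 0.10, "A2": 0.12, "A3": 0.11,   # CC-400 control positions
         "B1": 0.30, "B2": 0.12, "B3": 0.14}   # transformant positions
print(darker_than_control(plate, {"A1", "A2", "A3"}))
```

Positions returned by the function would then be recovered from the test plate or its replica.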


EXAMPLE 2 

[144] Step 1: Sequence design: Unique sequences a-h were searched for similarity to known sequences in the
Chlamydomonas genome using the WU-Blast 2.0 program on databases of the Chlamydomonas Genome Project,
located at (http://www.biology.duke.edu/chlamy_genome/blast/blast_form.html). The search produced no high
scoring segment pairs. The following databases were searched: Contig Set, EST clones, S1D2 ESTs, Volvocales
(non-EST), and BAC-ends (JGI). Searches were performed using the WU-blastn program using the default matrix
blosum62. Gapped alignments were allowed. The default expected threshold, filter, word length, and cutoff
scores were used. The sum statistics option was used for assessing the significance of aligned pairs. Primer and
chimeric oligonucleotide sequences were designed using sequences from the lhcb1 gene promoter (SEQ ID 148),
the 3' untranslated region of the RBCS2 gene (SEQ ID 150), and a green fluorescent protein gene (SEQ ID 179).

[145] Step 2: Obtaining cDNA sequences: cDNA sequences are obtained, using methods previously disclosed, for:
Chlamydomonas reinhardtii ferredoxin (GenBank accession number L10349, SEQ ID NO 172); Chlamydomonas
reinhardtii hydrogenase (GenBank accession number AF289201, SEQ ID NO 173); Scenedesmus obliquus
hydrogenase (GenBank accession number AJ271546, SEQ ID NO 177); and Chlorella fusca hydrogenase (GenBank
accession number AJ298227, SEQ ID NO 178). cDNA sequences are identified using synthetic oligonucleotides
corresponding to GenBank sequences as probes.

[146] The coding region of each of the 3 iron hydrogenase genes is amplified using the cDNA plasmid as template
and primers corresponding to the N and complement of the C terminal portions of the coding regions of the cDNA
sequences. PCR products corresponding to the coding regions of the 6 hydrogenase genes are gel-purified,
electroeluted, precipitated and resuspended in 50 mM Tris-HCl pH 7.4, 1 mM MgCl2. Alternatively, PCR primers
are removed from the reaction using the Wizard® PCR product and the PCR products are resuspended in 50 mM
Tris-HCl pH 7.4, 1 mM MgCl2. Chimeric oligonucleotides are synthesized according to Table 4 and are
resuspended in 50 mM Tris-HCl pH 7.4, 1 mM MgCl2.

[147] Step 3: Shuffling of hydrogenase coding regions: PCR products corresponding to the coding regions of the 6
hydrogenase genes are quantified using spectrophotometry. Equal molar amounts of each PCR product are pooled
to obtain a total of 4 µg DNA in 100 µl of 50 mM Tris-HCl pH 7.4, 1 mM MgCl2. DNase I is added at a
concentration of 0.15 units of DNase I per 100 µl of reaction volume. The digestion reaction proceeds for 15
minutes at room temperature and is stopped. Digestion products from approximately 20-150 base pairs are purified
from 2% low melting agarose gels, electroeluted, precipitated, and resuspended in water. 0.7123 pmol of chimeric
oligonucleotides 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 12.1, 12.2, 12.3, 12.4, 12.5, and 12.6 are added to each tube.
Chimeric oligonucleotides and 20-150 base pair hydrogenase coding region fragments are resuspended in 0.2 mM
of each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, to a volume of 100 µl
where the DNA concentration is approximately 20 ng/µl. 1.25 units of Taq polymerase and 1.25 units of Pfu
polymerase are added. The reaction is subjected to a thermocycling program of 94°C for 60 seconds one time,
followed by 40 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a
one-time incubation at 72°C for 5 minutes. 10 µl from the reaction is brought up to 100 µl in new PCR tubes in 0.2 mM
of each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, 8 µM of unique sequence
b and the complement of unique sequence c primers, and 1.25 units of Taq polymerase and 1.25 units of Pfu
polymerase. The amplification reaction is performed in a thermocycler with a program of 94°C for 60 seconds one
time, followed by 20 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a
one-time incubation at 72°C for 5 minutes. PCR products, now referred to as the hydrogenase shuffled library, are
gel purified, electroeluted, precipitated, and resuspended in water.
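Pooling equal molar amounts to a fixed total mass means each PCR product contributes mass in proportion to its length. The sketch below shows that arithmetic for the 4 µg pool; the gene labels and lengths are hypothetical, and the calculation assumes all products are double stranded DNA with the same average mass per base pair.

```python
def equimolar_masses_ug(lengths_bp, total_ug=4.0):
    """Mass of each product (in micrograms) so that every product is present
    in the same molar amount and the masses sum to `total_ug`."""
    total_bp = sum(lengths_bp.values())
    return {name: total_ug * bp / total_bp for name, bp in lengths_bp.items()}


# Hypothetical coding-region lengths, for illustration only.
products = {"hydrogenase A": 1500, "hydrogenase B": 1350, "hydrogenase C": 1650}
for name, ug in equimolar_masses_ug(products).items():
    print(f"{name}: {ug:.2f} µg")
```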

[148] Step 4: Error-prone PCR of ferredoxin: The Chlamydomonas reinhardtii ferredoxin coding region (SEQ ID
NO 172) is amplified by PCR using primers corresponding to the N terminal and complement of the C terminal
ends of the coding region. The coding region PCR product is then subjected to PCR using chimeric
oligonucleotides 13 and 14. The PCR product, consisting of the Chlamydomonas reinhardtii ferredoxin coding
region flanked by unique sequences d and e, is then subjected to error-prone PCR. The error-prone PCR is
performed using unique sequence d and the complement of unique sequence e as primers at a concentration of 1 µM
each, in a reaction also containing: 50 ng template (ferredoxin fragment flanked by unique sequences d and e), 20
mM Tris pH 8.4, 0.3 mM MnCl2, 3 mM MgCl2, 50 mM KCl, 0.01% gelatin, 0.2 mM dATP, 1 mM dCTP, 1 mM
dGTP, 1 mM dTTP, 1 U AmpliTaq polymerase (Perkin Elmer, Foster City, CA), essentially according to the
method of Leung, Technique (1989) 1, 11-15. The PCR products, now referred to as the ferredoxin library, are gel
purified, electroeluted, precipitated, and resuspended in water.

[149] Step 5: Construction of nonshuffled segments: Nonshuffled segments IX, X, XI, XII, and XIII are generated
through PCR amplification using primers and templates listed in Table 3. The position of these primers relative to
the sequence information they contain (not drawn to scale) is depicted in Figure 7 by arrows. Nonshuffled
segments IX, X, XI, XII, and XIII are gel purified, electroeluted, and precipitated. The fragments are resuspended
in water.

[150] Step 6: Construction of hydrogenase-ferredoxin test construct library: Equimolar amounts of nonshuffled
segments IX, X, XI, XII, and XIII, the hydrogenase shuffled library and the ferredoxin library are added together in
a new primerless PCR reaction. 1 pmol each of nonshuffled segments IX, X, XI, XII, and XIII, the hydrogenase
shuffled library, and the ferredoxin library are brought up to a volume of 100 µl in 0.2 mM of each dNTP, 2.2 mM
MgCl2, 50 mM KCl, 10 mM Tris-HCl pH 9.0, 0.1% Triton X-100, with 2.5 units of Pfu DNA polymerase. The
reaction is subjected to a thermocycling program of 94°C for 60 seconds one time, followed by 40 cycles of 94°C for
30 seconds, 55°C for 30 seconds, and 72°C for 30 seconds, followed by a one-time incubation at 72°C for 5 minutes.
Double stranded primerless PCR products, now referred to as the hydrogenase-ferredoxin test construct library, are
separated from oligonucleotides and fragments by gel electrophoresis and products of the expected size are
electroeluted, precipitated, and resuspended in sterile water.

[151] Step 7: Transformation of cells: The Chlamydomonas reinhardtii strain CC-400 is grown with shaking in
TAP media (Harris, (1989) The Chlamydomonas Sourcebook. Academic Press, New York; Gorman, Proc Natl
Acad Sci U S A (1965) Dec;54(6):1665-9) until the cells reach a density of approximately 2 x 10⁶ cells/ml. The
cells are pelleted at 4000 x g for 5 minutes and the supernatant is removed. The cell pellet is resuspended in 7.5 ml
of TAP medium per liter of original culture. The following components are added, in order, to 25 sterile tubes: 300
µl of cells, 1 µg of hydrogenase-ferredoxin test construct, 100 µl of sterile-filtered 20% PEG, 300 mg of sterile
glass beads (prepared according to Kindle, Meth Enzymology (1998) 297: 27-38). Each tube is vortexed 15-30
seconds at high speed. The cells are removed from the tube and are cultured in TAP media under continuous
illumination (approximately 55 µE m⁻² s⁻¹) at 25°C for 12 hours.

[152] Step 8: Screening cells for generation of hydrogen: Cells in media are illuminated with 395 nm light and
monitored for emission at 525 nm using fluorescence-activated cell sorting (Bloodgood et al. Exp Cell Res 1987
Dec;173(2):572-85; Hegemann). Cells exhibiting 525 nm GFP emission are recovered from the sorting protocol
and are plated in 96-colony grids on solid media. Replica plates are also made and stored at 15°C in low light. The
96-colony plates, made of clear plastic, are incubated in low light (approximately 5 µE m⁻² s⁻¹) at 25°C in
atmospheric air until colonies are approximately 3 mm in diameter. Chlamydomonas reinhardtii strain CC-400 is
used as a control on each 96-colony plate. After colonies have grown to the desired size, 3 mm thick filter paper is
placed over the plate, covering the colonies. A chemochromic film containing tungsten trioxide is placed on top of
the filter paper (Seibert). A rectangular clear plastic grid design is placed directly over the chemochromic film such
that the center of each square on the grid is directly over the center of a cell colony. The plates are incubated in
light (approximately 55 µE m⁻² s⁻¹) at 25°C in atmospheric air for 12 hours. The plates are illuminated from above
and below. After 12 hours, each plate is photographed from the top using a digital camera within 5 seconds of
removal from the incubation chamber. The images are scanned by densitometry and are subsequently screened for
dark spots on the chemochromic film that indicate the production of hydrogen. Spots that are quantitatively darker
than spots directly over control colonies of nontransformed Chlamydomonas reinhardtii strain CC-400 indicate cells
that generate an increased amount of hydrogen. These colonies are recovered from the test plates or the replica
plates.

[153] Step 9: Isolation and further mutagenesis of hydrogenase-ferredoxin test constructs that cause increased
production of hydrogen: Total DNA is isolated from the 5% of all transformant colonies exhibiting the highest
level of hydrogen production. Hydrogenase-ferredoxin test constructs are recovered from the DNA by PCR using
primers corresponding to unique sequence a and the complement of unique sequence h. PCR products are gel
purified, electroeluted, precipitated, and resuspended in water.
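Choosing the top 5% of transformants amounts to ranking colonies by their hydrogen signal and keeping the highest-scoring fraction. The following sketch uses hypothetical colony identifiers and signal values, and rounds the count up so that at least one colony is always kept (a choice made here, not one stated in the protocol).

```python
import math


def top_percent(signal_by_colony, percent=5):
    """Return colony IDs in the top `percent` by signal, highest first."""
    n_keep = max(1, math.ceil(len(signal_by_colony) * percent / 100))
    ranked = sorted(signal_by_colony, key=signal_by_colony.get, reverse=True)
    return ranked[:n_keep]


# Hypothetical densitometry signals for 40 colonies.
signals = {f"colony_{i:02d}": float(i) for i in range(40)}
print(top_percent(signals))
```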

[154] The hydrogenase-ferredoxin test constructs are quantified using spectrophotometry. Equimolar amounts of
each recovered test construct are pooled to a total of 4 µg of test construct and are diluted to 100 µl to yield a
reaction tube containing 50 mM Tris-HCl pH 7.4, 1 mM MgCl2. DNase I is added at a concentration of 0.15 units
of DNase I per 100 µl of reaction volume. The digestion reaction proceeds for 15 minutes at room temperature.
Digestion products from approximately 20-150 base pairs are purified from 2% low melting agarose gels,
electroeluted, precipitated, and resuspended in 0.2 mM of each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM
Tris-HCl pH 9.0, 0.1% Triton X-100, to a volume of 100 µl where the DNA concentration is approximately 20
ng/µl. 1.25 units of Taq polymerase and 1.25 units of Pfu polymerase are added. The reaction is subjected to a
thermocycling program of 94°C for 60 seconds one time, followed by 40 cycles of 94°C for 30 seconds, 55°C for 30
seconds, and 72°C for 30 seconds, followed by a one-time incubation at 72°C for 5 minutes. 10 µl from the reaction
is brought up to 100 µl in new PCR tubes in 0.2 mM of each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris-HCl
pH 9.0, 0.1% Triton X-100, 8 µM of unique sequence a and the complement of unique sequence h primers, 1.25
units of Taq polymerase and 1.25 units of Pfu polymerase. The amplification reaction is performed in a
thermocycler with a program of 94°C for 60 seconds one time, followed by 20 cycles of 94°C for 30 seconds,
55°C for 30 seconds, and 72°C for 30 seconds, followed by a one-time incubation at 72°C for 5 minutes. PCR
products, now referred to as the hydrogenase-ferredoxin secondary test constructs, are gel purified, electroeluted,
precipitated, and resuspended in sterile water.

[155] Step 10: Transformation of cells: The Chlamydomonas reinhardtii strain CC-400 is grown with shaking in
TAP media (Harris, (1989) The Chlamydomonas Sourcebook. Academic Press, New York; Gorman, Proc Natl
Acad Sci U S A (1965) Dec;54(6):1665-9) until the cells reach a density of approximately 2 x 10⁶ cells/ml. The
cells are pelleted at 4000 x g for 5 minutes and the supernatant is removed. The cell pellet is resuspended in 7.5 ml
of TAP medium per liter of original culture. The following components are added, in order, to 25 sterile tubes: 300
µl of cells, 1 µg of hydrogenase-ferredoxin secondary test construct, 100 µl of sterile-filtered 20% PEG, 300 mg of
sterile glass beads (prepared according to Kindle, Meth Enzymology (1998) 297: 27-38). Each tube is vortexed 15-
30 seconds at high speed. The cells are removed from the tube and are cultured in TAP media under continuous
illumination (approximately 55 µE m⁻² s⁻¹) at 25°C for 12 hours.

[156] Step 11: Screening cells for generation of hydrogen: Cells in media are illuminated with 395 nm light and
monitored for emission at 525 nm using fluorescence-activated cell sorting (Bloodgood et al. Exp Cell Res 1987
Dec;173(2):572-85; Hegemann). Cells exhibiting 525 nm GFP emission are recovered from the sorting protocol
and are plated in 96-colony grids on solid media. Replica plates are also made and stored at 15°C in low light. The
96-colony plates, made of clear plastic, are incubated in low light (approximately 5 µE m⁻² s⁻¹) at 25°C in
atmospheric air until colonies are approximately 3 mm in diameter. Chlamydomonas reinhardtii strain CC-400 is
used as a control on each 96-colony plate. After colonies have grown to the desired size, 3 mm thick filter paper is
placed over the plate, covering the colonies. A chemochromic film containing tungsten trioxide is placed on top of
the filter paper (Seibert). A rectangular clear plastic grid design is placed directly over the chemochromic film such
that the center of each square on the grid is directly over the center of a cell colony. The plates are incubated in
light (approximately 55 µE m⁻² s⁻¹) at 25°C in atmospheric air for 12 hours. The plates are illuminated from above
and below. After 12 hours, each plate is photographed from the top using a digital camera within 5 seconds of
removal from the incubation chamber. The images are scanned by densitometry and are subsequently screened for
dark spots on the chemochromic film that indicate the production of hydrogen. Spots that are quantitatively darker
than spots directly over control colonies of nontransformed Chlamydomonas reinhardtii strain CC-400 indicate cells
that generate an increased amount of hydrogen. These colonies are recovered and are used for hydrogen production
and/or further development.

EXAMPLE 3

Multiparental Mating Protocol

[157] 1. Place cells from 3 or more strains of algae capable of mating to each other, such as Chlamydomonas
reinhardtii, together in the same tube, where at least one strain is of a different mating type than at least one other
strain. For example, place approximately the same number of cells of the following strains into the tube: CC-124,
CC-125, CC-1690, CC-1692, CC-407, CC-408, CC-1952, CC-2290, CC-2342, CC-2343, CC-2344, CC-2931,
CC-2932, CC-2935, CC-2936, CC-2937, CC-2938, CC-3059, CC-3060, CC-3061, CC-3062, CC-3063, CC-3064,
CC-3065, CC-3067, CC-3068, CC-3071, CC-3073, CC-3074, CC-3075, CC-3076, CC-3078, CC-3079, CC-3080,
CC-3082, CC-3083, CC-3084, CC-3086, CC-1373 and CC-3087.

[158] 2. Suspend the cells in nitrogen-free medium, such as Sueoka's medium without NH4Cl.

[159] 3. Incubate in light, for 12 hours, or for 1 day, or 2 days, or 3 days, or 4 days, or for 5, 6, 7, 8, 9, 10, or more
days, or for fractions of the aforementioned numbers of days.

[160] 4. Add nitrogen (such as NH4Cl) to media or move cells into nitrogen-containing media and incubate in
light, for 12 hours, or for 1 day, or 2 days, or 3 days, or 4 days, or for 5, 6, 7, 8, 9, 10, or more days, or for fractions
of the aforementioned numbers of days.

[161] 5. Collect cells and change media back to nitrogen-free and incubate in light for 12 hours, or for 1 day, or 2
days, or 3 days, or 4 days, or for 5, 6, 7, 8, 9, 10, or more days, or for fractions of the aforementioned numbers of
days.

[162] 6. Repeat steps 4-5 as many times as desired.

[163] 7. Plate mating reaction on solid media (or optionally sort cells individually with a cell sorter) and pick
colonies.

[164] 8. Array strains from colonies into multiwell plates containing liquid culture media.

[165] 9. Screen or select for a desired phenotype.

[166] 10. Identify 3 or more novel strains from step 9 that have the desired phenotype.

[167] 11. Repeat steps 1-9 as many times as desired.

[168] To make 1 liter of Sueoka's high salt media*:

Phosphate Buffer                       50 ml
Beijerinck's stock                     50 ml
Hutner's trace elements (see TAP)      1 ml
Sodium acetate                         2.0 g (1.2 g if anhydrous)

[169] Phosphate Buffer

Component        For 1 liter
K2HPO4           28.8 g
KH2PO4           14.4 g

[170] Beijerinck's stock

Component        For 1 liter
NH4Cl            10 g
MgSO4·7H2O       0.4 g
CaCl2·2H2O       0.2 g

*Media for inducing gametogenesis can be made by withholding NH4Cl from the Beijerinck's stock.
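The final salt concentrations in the medium follow from diluting 50 ml of each stock into 1 liter. The sketch below works through that arithmetic. It assumes the phosphate buffer amounts pair as 28.8 g K2HPO4 and 14.4 g KH2PO4 per liter of stock, and it models the gametogenesis variant simply by leaving NH4Cl out; names in the code are illustrative.

```python
# Stock compositions in grams per liter of stock (phosphate pairing assumed).
STOCKS_G_PER_L = {
    "Phosphate Buffer": {"K2HPO4": 28.8, "KH2PO4": 14.4},
    "Beijerinck's stock": {"NH4Cl": 10.0, "MgSO4·7H2O": 0.4, "CaCl2·2H2O": 0.2},
}
ML_STOCK_PER_LITER_MEDIUM = {"Phosphate Buffer": 50, "Beijerinck's stock": 50}


def final_g_per_liter(omit=()):
    """Final concentration of each salt in 1 liter of medium, in g/l.
    Components named in `omit` are withheld from their stock."""
    final = {}
    for stock, salts in STOCKS_G_PER_L.items():
        dilution = ML_STOCK_PER_LITER_MEDIUM[stock] / 1000.0
        for salt, grams in salts.items():
            if salt not in omit:
                final[salt] = grams * dilution
    return final


print(final_g_per_liter())                  # complete medium
print(final_g_per_liter(omit=("NH4Cl",)))   # nitrogen-free variant
```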

EXAMPLE 4
Gene Reassembly

[171] The process of chimeric gene assembly is depicted in figures 13-14. Sections of the active site region that
are both highly conserved and correspond to the gas channel were identified using structural data, as shown in
figure 9. In step 1 of figure 13, a library of approximately 110 unique iron hydrogenase amino acid sequences was
aligned using sequence manipulation software (DS Gene 1.5, Accelrys Inc., San Diego, CA). The key in figure 15
shows the identity of amino acids from step 1 and codons from steps 2-9. In step 2, peptide sequences of conserved
gas channel segments were reverse-translated into single stranded oligonucleotide sequences using C. reinhardtii
most preferred codons from Figure 10. All bars in step 1 correspond to amino acids of aligned iron hydrogenases.
All bars in steps 2-9 correspond to codons that encode the amino acids from the bars of step 1. Each bar in steps 2-
9 therefore depicts a codon triplet of oligonucleotide sequence. In step 3, three codons encoding amino acids that
flank each side of the conserved gas channel segments were re-written to encode the corresponding C. reinhardtii
amino acids in those flanking positions. Each oligonucleotide of step 3 therefore encodes (from left to right) three
C. reinhardtii codons that flank the N-terminal side of a gas channel segment, followed by codons corresponding to
a non-C. reinhardtii gas channel segment, followed by three C. reinhardtii codons that flank the C-terminal side of
the gas channel segment. Even though these oligonucleotides encode different sequences from the C. reinhardtii
iron hydrogenase, the combination of recoding and the substitution of 3 flanking codons on either side of the gas
channel segment generates enough nucleotide similarity that these oligonucleotides anneal to a complementary
strand encoding the recoded, wild-type C. reinhardtii iron hydrogenase. In step 4, the entire set of recoded
oligonucleotides is mixed and annealed to single stranded "scaffold" DNA molecules that encode the wild type C.
reinhardtii iron hydrogenase protein in recoded form. Recoding the wild type C. reinhardtii iron hydrogenase to
make the scaffold achieves maximum sequence identity between the scaffold and the recoded oligonucleotides
because the wild type C. reinhardtii iron hydrogenase gene does not contain only the most highly preferred codons.
Oligonucleotides corresponding to wild type C. reinhardtii gas channel segments with single residue substitutions
designed to narrow the gas channel can also be mixed into the annealing reaction. The single stranded scaffold
molecule is generated by isolating the gene from a plasmid grown in a methylating host cell, followed by
denaturation and separation of the strands by HPLC or other standard procedures, as described for example in U.S.
patent 6,361,974. None of the primers anneal to partially overlapping sites on the C. reinhardtii strand, so no
exonuclease treatment is needed to "clip" strands partially displaced by annealing of other oligonucleotides. In step
5 of figure 14, different combinations of diverse gas channel segments anneal to each full length complementary
strand. Each oligonucleotide has at least 9 perfect base pairs on both ends, ensuring sufficient annealing despite
internal mismatches due to sequence variation of the gas channel segments. Addition of DNA polymerase in step 6
extends the annealed oligonucleotides, creating a combinatorial library of double stranded hybrid iron hydrogenase
molecules with numerous mismatches at "context" residue positions. Preferably the DNA polymerase is
exonuclease-deficient to prevent it from degrading parts of annealed primers in its path as it extends between
annealed primers. In step 7, the methylated strands are digested using a methylation-sensitive endonuclease, as
described for example in U.S. patent 6,361,974. An alternative method for separating the scaffold strands from the
library strands is to use a biotinylated C-terminal primer and separate the library strands using immobilized
streptavidin. In steps 8-9, N- and C-terminal C. reinhardtii primers and DNA polymerase are added to the library
of novel iron hydrogenase molecules for a single round of amplification. The result is a library of double stranded
iron hydrogenase sequences that have random combinations of functional gas channel segments but C. reinhardtii
framework/hinge regions. The library is cloned into C. reinhardtii cells and assayed for catalytic activity in the
presence of O2. Library members identified as active in the presence of O2 are sequenced and a new library is made
using the above method and oligonucleotides designed to anneal to a representative single stranded iron
hydrogenase identified from the first library. The screening process on the second library is performed in the
presence of an additional amount of oxygen compared to the first round. This gene reassembly procedure can be
used to mutagenize any nucleic acid sequence.
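Steps 2-3 of the reassembly scheme (reverse translation with preferred codons, then flanking each donor segment with three host codons on either side) can be sketched as follows. The codon table here is a small illustrative placeholder and is not taken from Figure 10; the peptide segments are invented; and for simplicity the oligonucleotide is compared with the coding-strand sequence of the recoded scaffold region rather than with its complement.

```python
# Illustrative one-codon-per-residue table (placeholder, NOT Figure 10).
PREFERRED = {"A": "GCC", "C": "TGC", "G": "GGC", "L": "CTG",
             "P": "CCC", "S": "AGC", "T": "ACC", "V": "GTG"}


def reverse_translate(peptide):
    """Recode a peptide using a single preferred codon per amino acid."""
    return "".join(PREFERRED[aa] for aa in peptide)


def segment_oligo(host_n_flank, donor_segment, host_c_flank):
    """Three host flanking codons, the donor gas channel segment, then
    three host flanking codons, all in recoded form."""
    if len(host_n_flank) != 3 or len(host_c_flank) != 3:
        raise ValueError("each flank must be exactly three residues")
    return reverse_translate(host_n_flank + donor_segment + host_c_flank)


def ends_match(oligo, scaffold_region, n=9):
    """True if the first and last `n` bases are identical to the scaffold."""
    return (len(oligo) == len(scaffold_region)
            and oligo[:n] == scaffold_region[:n]
            and oligo[-n:] == scaffold_region[-n:])


scaffold = reverse_translate("GLS" + "APCV" + "TPG")   # hypothetical host region
oligo = segment_oligo("GLS", "ATCV", "TPG")            # hypothetical donor segment
mismatches = sum(a != b for a, b in zip(oligo, scaffold))
print(oligo, ends_match(oligo, scaffold), mismatches)
```

Because both the scaffold and the oligonucleotides use the same codon for each residue, mismatches are confined to positions where the donor segment differs in amino acid sequence, and the three flanking codons supply the 9 matching bases at each end.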

TABLE 1

Columns: Product; 5' primer; 5' primer sequence; 3' primer; 3' primer sequence; Template

Nonshuffled segment I
  5' primer: First 24 nucleotides of promoter fragment of the lhcb1 gene
  5' primer sequence: 5' gcagttgggtcaggggctggcgac 3'
  3' primer: Complement of unique sequence a - complement of last 25 base pairs of the promoter fragment of the lhcb1 gene
  3' primer sequence: 5' gctaagatggccataaggataactacggattaacgaaatgagtctcgcccgcggc 3'
  Template: SEQ ID NO 148

Nonshuffled segment II
  5' primer: Unique sequence b - first 25 nucleotides of 3' UTR from RBCS2 gene
  5' primer sequence: 5' cgtgcatcgattaacagcttctggacctgaccgacgtcgacccactctagaggat 3'
  3' primer: Complement of unique sequence c - complement of last 25 base pairs of the promoter fragment of the lhcb1 gene
  3' primer sequence: 5' cttagtcatacttggacgtacgacgtttaataacgaaatgagtctcgcccgcggc 3'
  Template: SEQ ID NO 151

Nonshuffled segment III
  5' primer: Unique sequence d - first 25 nucleotides of 3' UTR from RBCS2 gene
  5' primer sequence: 5' aatctgatacatgctattcagatcttacaaccgacgtcgacccactctagaggat 3'
  3' primer: Complement of unique sequence e - complement of last 25 base pairs of the promoter fragment of the lhcb1 gene
  3' primer sequence: 5' agttacgatttactagtcgagtagacattttaacgaaatgagtctcgcccgcggc 3'
  Template: SEQ ID NO 151

Nonshuffled segment IV
  5' primer: Unique sequence f - first 25 nucleotides of 3' UTR from RBCS2 gene
  5' primer sequence: 5' atctgtaataatctagtcgaggcattcaagccgacgtcgacccactctagaggat 3'
  3' primer: Complement of unique sequence k - complement of last 24 nucleotides of 3' UTR from RBCS2 gene
  3' primer sequence: 5' cgaatcctcgttagtaactattccgactaccaaatacgcccagcccgcccatgg 3'
  Template: SEQ ID NO 150

Nonshuffled segment V
  5' primer: Unique sequence k - first 25 nucleotides of the ble selectable marker cassette
  5' primer sequence: 5' gtagtcggaatagttactaacgaggattcggccagaaggagcgcagccaaaccag 3'
  3' primer: Complement of unique sequence l - complement of last 25 nucleotides of the ble selectable marker cassette
  3' primer sequence: 5' agttacgatttactagtcgagtagacatttggtaccgggccccccctcgagtta 3'
  Template: SEQ ID NO 149

Nonshuffled segment VI
  5' primer: Unique sequence l - first 24 nucleotides of promoter fragment of the lhcb1 gene
  5' primer sequence: 5' aaatgtctactcgactagtaaatcgtaactgcagttgggtcaggggctggcgac 3'
  3' primer: Complement of unique sequence g - complement of last 25 nucleotides of promoter fragment of the lhcb1 gene
  3' primer sequence: 5' tcacacgattgttaacgatttaagccagtttaacgaaatgagtctcgcccgcggc 3'
  Template: SEQ ID NO 148

Nonshuffled segment VII
  5' primer: Unique sequence h - first 25 nucleotides of 3' UTR from RBCS2 gene
  5' primer sequence: 5' gatttaacataactgtcgattaccgtgcgaccgacgtcgacccactctagaggat 3'
  3' primer: Complement of unique sequence i - complement of last 25 nucleotides of promoter fragment of the lhcb1 gene
  3' primer sequence: 5' ttgtcaccaggattacgattgtcaagcatataacgaaatgagtctcgcccgcggc 3'
  Template: SEQ ID NO 151

Nonshuffled segment VIII
  5' primer: Unique sequence j - first 25 nucleotides of 3' UTR from RBCS2 gene
  5' primer sequence: 5' taacaagaatctggctaatcaatcgatgcaccgacgtcgacccactctagaggat 3'
  3' primer: Complement of last 24 nucleotides of 3' UTR from RBCS2 gene
  3' primer sequence: 5' caaatacgcccagcccgcccatgg 3'
  Template: SEQ ID NO 150
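Adjacent segments in Table 1 are joined through a shared unique sequence: the 3' primer of one segment carries the complement of the unique sequence that begins the 5' primer of the next. A quick reverse-complement check illustrates this for unique sequences k and l. The 30-nucleotide strings below are read from the Table 1 primer entries with extraction spacing removed, which is an assumption about how the garbled entries should be joined.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# First 30 nt of the segment V and VI 5' primers (unique sequences k and l).
unique_k = "gtagtcggaatagttactaacgaggattcg"
unique_l = "aaatgtctactcgactagtaaatcgtaact"

# First 30 nt of the segment IV and V 3' primers.
seg_iv_3prime_head = "cgaatcctcgttagtaactattccgactac"
seg_v_3prime_head = "agttacgatttactagtcgagtagacattt"

print(reverse_complement(unique_k) == seg_iv_3prime_head)
print(reverse_complement(unique_l) == seg_v_3prime_head)
```

Both comparisons hold for the sequences as transcribed here, which is consistent with segments IV/V and V/VI overlapping through k and l during primerless assembly.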



[172] Table 2 Key to nomenclature: Chimeric oligonucleotides are designed according to sequences derived from
the 5' and 3' ends of the 70 cDNAs of the 1-5 H2 set. All portions of chimeric oligonucleotides corresponding to
the 5' end of a cDNA start with a start codon. For instance, the oligonucleotide 1.1 from Table 1 has a sequence of
5' atccgtagttatccttatggccatcttagc-atg[cpul1h2]27 3'. This oligonucleotide's first 30 nucleotides, reading from 5' to
3', encode unique sequence a (SEQ ID NO 152). Nucleotides 31-33 encode a start codon (atg). After the start
codon the sequence is from the 5' end of the Chlamydomonas pulvinata 1 H2 gene coding sequence, beginning after
the start codon. Sequence listed in italics corresponds to the portion of the description written in italics. All
portions of chimeric oligonucleotides corresponding to the 3' end of a cDNA end with a stop codon. For instance,
the oligonucleotide 2.1 from Table 1 has a sequence of 5' [cpul1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'. This
oligonucleotide's first 27 nucleotides, reading from 5' to 3', encode the last 27 nucleotides of the Chlamydomonas
pulvinata 1 H2 gene coding sequence, followed by a stop codon. After the stop codon the sequence is unique
sequence b (SEQ ID NO 153).
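
The nomenclature above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the placeholder coding sequence `toy_cds` are hypothetical and are not taken from the specification; only unique sequences a and b are copied from Table 2.

```python
# Unique sequences a and b as listed in Table 2 (SEQ ID NO 152 and 153).
UNIQUE_A = "atccgtagttatccttatggccatcttagc"
UNIQUE_B = "cgtgcatcgattaacagcttctggacctga"
STOP_CODONS = {"taa", "tag", "tga"}


def five_prime_oligo(unique_seq: str, cds: str, n_gene: int = 27) -> str:
    """Unique sequence, then atg, then the next n_gene nucleotides of the CDS."""
    cds = cds.lower()
    if not cds.startswith("atg"):
        raise ValueError("coding sequence must begin with a start codon (atg)")
    return unique_seq.lower() + "atg" + cds[3:3 + n_gene]


def three_prime_oligo(cds: str, unique_seq: str, n_gene: int = 27) -> str:
    """Last n_gene coding nucleotides, then the stop codon, then the unique sequence."""
    cds = cds.lower()
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    return cds[-(n_gene + 3):-3] + stop + unique_seq.lower()


# Hypothetical placeholder coding sequence (NOT a real H2 gene): atg + 12 codons + taa.
toy_cds = "atg" + "gct" * 12 + "taa"

oligo_5p = five_prime_oligo(UNIQUE_A, toy_cds)   # layout of oligos 1.1 to 1.14
oligo_3p = three_prime_oligo(toy_cds, UNIQUE_B)  # layout of oligos 2.1 to 2.14
print(oligo_5p)
print(oligo_3p)
```

With the default of 27 gene-derived nucleotides, both oligonucleotides are 60 nucleotides long: 30 for the unique sequence, 3 for the start or stop codon, and 27 from the coding sequence.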



TABLE 2

Oligo # | 5' end corresponding to: | 3' end corresponding to: | Sequence








1.1 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas pulvinata 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[cpul1h2]27 3'
1.2 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas pygmaea 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[cpyg1h2]27 3'
1.3 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas radiata 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[crad1h2]27 3'
1.4 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas rapa 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[crap1h2]27 3'
1.5 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas sajao 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[csaj1h2]27 3'
1.6 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas segnis 222 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[cseg 222 1h2]27 3'
1.7 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas segnis 1638 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[cseg 1638 1h2]27 3'
1.8 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas segnis 1919 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[cseg 1919 1h2]27 3'
1.9 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas smithii 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[csmi1h2]27 3'
1.10 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas sphaeroides 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[csph1h2]27 3'
1.11 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas surtseyiensis 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[csur1h2]27 3'
1.12 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas ulvaensis 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[culv1h2]27 3'
1.13 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas zimbabwiensis 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[czim1h2]27 3'
1.14 | Unique sequence a (SEQ ID NO 152) | First 30 bp of 5' end of Chlamydomonas reinhardtii 1 H2 gene coding sequence | 5' atccgtagttatccttatggccatcttagc-atg[crei1h2]27 3'
2.1 | Last 30 bp of 3' end of Chlamydomonas pulvinata 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [cpul1h2]30-cgtgcatcgattaacagcttctggacctga 3'
2.2 | Last 30 bp of 3' end of Chlamydomonas pygmaea 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [cpyg1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.3 | Last 30 bp of 3' end of Chlamydomonas radiata 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [crad1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.4 | Last 30 bp of 3' end of Chlamydomonas rapa 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [crap1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.5 | Last 30 bp of 3' end of Chlamydomonas sajao 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [csaj1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.6 | Last 30 bp of 3' end of Chlamydomonas segnis 222 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [cseg 222 1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.7 | Last 30 bp of 3' end of Chlamydomonas segnis 1638 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [cseg 1638 1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.8 | Last 30 bp of 3' end of Chlamydomonas segnis 1919 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [cseg 1919 1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.9 | Last 30 bp of 3' end of Chlamydomonas smithii 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [csmi1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.10 | Last 30 bp of 3' end of Chlamydomonas sphaeroides 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [csph1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.11 | Last 30 bp of 3' end of Chlamydomonas surtseyiensis 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [csur1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.12 | Last 30 bp of 3' end of Chlamydomonas ulvaensis 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [culv1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.13 | Last 30 bp of 3' end of Chlamydomonas zimbabwiensis 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [czim1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'
2.14 | Last 30 bp of 3' end of Chlamydomonas reinhardtii 1 H2 gene coding sequence | Unique sequence b (SEQ ID NO 153) | 5' [crei1h2]27taa-cgtgcatcgattaacagcttctggacctga 3'


3.1 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas pulvinata 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-
3.2 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas pygmaea 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[cpyg1h2]27 3'
3.3 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas radiata 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[crad1h2]27 3'
3.4 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas rapa 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[crap1h2]27 3'
3.5 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas sajao 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[csaj1h2]27 3'
3.6 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas segnis 222 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[cseg 222 1h2]27 3'
3.7 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas segnis 1638 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[cseg 1638 1h2]27 3'
3.8 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas segnis 1919 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[cseg 1919 1h2]27 3'
3.9 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas smithii 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[csmi1h2]27 3'
3.10 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas sphaeroides 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[csph1h2]27 3'
3.11 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas surtseyiensis 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[csur1h2]27 3'
3.12 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas ulvaensis 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[culv1h2]27 3'
3.13 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas zimbabwiensis 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[czim1h2]27 3'
3.14 | Unique sequence c (SEQ ID NO 154) | First 30 bp of 5' end of Chlamydomonas reinhardtii 2 H2 gene coding sequence | 5' ttaaacgtcgtacgtccaagtataactaag-atg[crei1h2]27 3'


4.1 | Last 30 bp of 3' end of Chlamydomonas pulvinata 1 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [cpul2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.2 | Last 30 bp of 3' end of Chlamydomonas pygmaea 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [cpyg2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.3 | Last 30 bp of 3' end of Chlamydomonas radiata 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [crad2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.4 | Last 30 bp of 3' end of Chlamydomonas rapa 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [crap2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.5 | Last 30 bp of 3' end of Chlamydomonas sajao 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [csaj2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.6 | Last 30 bp of 3' end of Chlamydomonas segnis 222 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [cseg 222 2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.7 | Last 30 bp of 3' end of Chlamydomonas segnis 1638 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [cseg 1638 2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.8 | Last 30 bp of 3' end of Chlamydomonas segnis 1919 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [cseg 1919 2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.9 | Last 30 bp of 3' end of Chlamydomonas smithii 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [csmi2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.10 | Last 30 bp of 3' end of Chlamydomonas sphaeroides 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [csph2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.11 | Last 30 bp of 3' end of Chlamydomonas surtseyiensis 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [csur2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.12 | Last 30 bp of 3' end of Chlamydomonas ulvaensis 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [culv2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.13 | Last 30 bp of 3' end of Chlamydomonas zimbabwiensis 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [czim2h2]27taa-aatctgatacatgctattcagatcttacaa 3'
4.14 | Last 30 bp of 3' end of Chlamydomonas reinhardtii 2 H2 gene coding sequence | Unique sequence d (SEQ ID NO 155) | 5' [crei2h2]27taa-aatctgatacatgctattcagatcttacaa 3'


5.1 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas pulvinata 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[cpul3h2]27 3'
5.2 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas pygmaea 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[cpyg3h2]27 3'
5.3 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas radiata 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[crad3h2]27 3'
5.4 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas rapa 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[crap3h2]27 3'
5.5 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas sajao 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[csaj3h2]27 3'
5.6 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas segnis 222 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[cseg 222 3h2]27 3'
5.7 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas segnis 1638 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[cseg 1638 3h2]27 3'
5.8 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas segnis 1919 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[cseg 1919 3h2]27 3'
5.9 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas smithii 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[csmi3h2]27 3'
5.10 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas sphaeroides 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[csph3h2]27 3'
5.11 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas surtseyiensis 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[csur3h2]27 3'
5.12 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas ulvaensis 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[culv3h2]27 3'
5.13 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas zimbabwiensis 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[czim3h2]27 3'
5.14 | Unique sequence e (SEQ ID NO 156) | First 30 bp of 5' end of Chlamydomonas reinhardtii 3 H2 gene coding sequence | 5' aaatgtctactcgactagtaaatcgtaact-atg[crei3h2]27 3'


6.1 | Last 30 bp of 5' end of Chlamydomonas pulvinata 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [cpul3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.2 | Last 30 bp of 3' end of Chlamydomonas pygmaea 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [cpyg3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.3 | Last 30 bp of 3' end of Chlamydomonas radiata 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [crad3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.4 | Last 30 bp of 3' end of Chlamydomonas rapa 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [crap3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.5 | Last 30 bp of 3' end of Chlamydomonas sajao 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [csaj3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.6 | Last 30 bp of 3' end of Chlamydomonas segnis 222 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [cseg 222 3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.7 | Last 30 bp of 3' end of Chlamydomonas segnis 1638 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [cseg 1638 3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.8 | Last 30 bp of 3' end of Chlamydomonas segnis 1919 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [cseg 1919 3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.9 | Last 30 bp of 3' end of Chlamydomonas smithii 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [csmi3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.10 | Last 30 bp of 3' end of Chlamydomonas sphaeroides 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [csph3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.11 | Last 30 bp of 3' end of Chlamydomonas surtseyiensis 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [csur3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.12 | Last 30 bp of 3' end of Chlamydomonas ulvaensis 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [culv3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.13 | Last 30 bp of 3' end of Chlamydomonas zimbabwiensis 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [czim3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'
6.14 | Last 30 bp of 3' end of Chlamydomonas reinhardtii 3 H2 gene coding sequence | Unique sequence f (SEQ ID NO 157) | 5' [crei3h2]27taa-atctgtaataatctagtcgaggcattcaag 3'


7.1 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas pulvinata 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[cpul4h2]27 3'
7.2 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas pygmaea 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[cpyg4h2]27 3'
7.3 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas radiata 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[crad4h2]27 3'
7.4 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas rapa 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[crap4h2]27 3'
7.5 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas sajao 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[csaj4h2]27 3'
7.6 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas segnis 222 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[cseg 222 4h2]27 3'
7.7 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas segnis 1638 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[cseg 1638 4h2]27 3'
7.8 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas segnis 1919 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[cseg 1919 4h2]27 3'
7.9 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas smithii 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[csmi4h2]27 3'
7.10 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas sphaeroides 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[csph4h2]27 3'
7.11 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas surtseyiensis 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[csur4h2]27 3'
7.12 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas ulvaensis 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[culv4h2]27 3'
7.13 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas zimbabwiensis 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[czim4h2]27 3'
7.14 | Unique sequence g (SEQ ID NO 158) | First 30 bp of 5' end of Chlamydomonas reinhardtii 4 H2 gene coding sequence | 5' aactggcttaaatcgttaacaatcgtgtga-atg[crei4h2]27 3'


8.1 | Last 30 bp of 5' end of Chlamydomonas pulvinata 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [cpul4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.2 | Last 30 bp of 3' end of Chlamydomonas pygmaea 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [cpyg4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.3 | Last 30 bp of 3' end of Chlamydomonas radiata 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [crad4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.4 | Last 30 bp of 3' end of Chlamydomonas rapa 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [crap4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.5 | Last 30 bp of 3' end of Chlamydomonas sajao 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [csaj4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.6 | Last 30 bp of 3' end of Chlamydomonas segnis 222 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [cseg 222 4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.7 | Last 30 bp of 3' end of Chlamydomonas segnis 1638 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [cseg 1638 4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.8 | Last 30 bp of 3' end of Chlamydomonas segnis 1919 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [cseg 1919 4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.9 | Last 30 bp of 3' end of Chlamydomonas smithii 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [csmi4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.10 | Last 30 bp of 3' end of Chlamydomonas sphaeroides 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [csph4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.11 | Last 30 bp of 3' end of Chlamydomonas surtseyiensis 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [csur4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.12 | Last 30 bp of 3' end of Chlamydomonas ulvaensis 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [culv4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.13 | Last 30 bp of 3' end of Chlamydomonas zimbabwiensis 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [czim4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'
8.14 | Last 30 bp of 3' end of Chlamydomonas reinhardtii 4 H2 gene coding sequence | Unique sequence h (SEQ ID NO 159) | 5' [crei4h2]27taa-gatttaacataactgtcgattaccgtgcga 3'


9.1 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas pulvinata 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[cpul5h2]27 3'
9.2 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas pygmaea 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[cpyg5h2]27 3'
9.3 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas radiata 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[crad5h2]27 3'
9.4 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas rapa 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[crap5h2]27 3'
9.5 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas sajao 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[csaj5h2]27 3'
9.6 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas segnis 222 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[cseg 222 5h2]27 3'
9.7 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas segnis 1638 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[cseg 1638 5h2]27 3'
9.8 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas segnis 1919 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[cseg 1919 5h2]27 3'
9.9 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas smithii 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[csmi5h2]27 3'
9.10 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas sphaeroides 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[csph5h2]27 3'
9.11 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas surtseyiensis 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[csur5h2]27 3'
9.12 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas ulvaensis 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[culv5h2]27 3'
9.13 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas zimbabwiensis 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[czim5h2]27 3'
9.14 | Unique sequence i (SEQ ID NO 160) | First 30 bp of 5' end of Chlamydomonas reinhardtii 5 H2 gene coding sequence | 5' tatgcttgacaatcgtaatcctggtgacaa-atg[crei5h2]27 3'


10.1 | Last 30 bp of 5' end of Chlamydomonas pulvinata 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [cpul5h2]30-taacaagaatctggctaatcaatcgatgca 3'
10.2 | Last 30 bp of 3' end of Chlamydomonas pygmaea 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [cpyg5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.3 | Last 30 bp of 3' end of Chlamydomonas radiata 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [crad5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.4 | Last 30 bp of 3' end of Chlamydomonas rapa 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [crap5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.5 | Last 30 bp of 3' end of Chlamydomonas sajao 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [csaj5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.6 | Last 30 bp of 3' end of Chlamydomonas segnis 222 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [cseg 222 5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.7 | Last 30 bp of 3' end of Chlamydomonas segnis 1638 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [cseg 1638 5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.8 | Last 30 bp of 3' end of Chlamydomonas segnis 1919 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [cseg 1919 5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.9 | Last 30 bp of 3' end of Chlamydomonas smithii 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [csmi5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.10 | Last 30 bp of 3' end of Chlamydomonas sphaeroides 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [csph5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.11 | Last 30 bp of 3' end of Chlamydomonas surtseyiensis 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [csur5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.12 | Last 30 bp of 3' end of Chlamydomonas ulvaensis 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [culv5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.13 | Last 30 bp of 3' end of Chlamydomonas zimbabwiensis 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [czim5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'
10.14 | Last 30 bp of 3' end of Chlamydomonas reinhardtii 5 H2 gene coding sequence | Unique sequence j (SEQ ID NO 161) | 5' [crei5h2]27taa-taacaagaatctggctaatcaatcgatgca 3'



TABLE 3

Product | 5' primer | 5' primer sequence | 3' primer | 3' primer sequence | Template
Nonshuffled segment IX | Unique sequence a- first 24 nucleotides of promoter fragment of the lhcbl gene | 5' atccgtagtt atccttatgg ccatcttagc gcagttgggtca ggggctggcgac 3' | Complement of unique sequence b- complement of last 25 base pairs of the promoter fragment of the lhcbl gene | 5' tcaggtccagaagctgt taatcgatgcacg taacgaaatgag tctcgcccgcggc 3' | SEQ ID NO 148
Nonshuffled segment X | Unique sequence c- first 25 nucleotides of 3' UTR from RBCS2 gene | 5' ttaaacgtcg tacgtccaag tataactaag ccgacgtcgaccca ctctagaggat 3' | Complement of unique sequence d- complement of last 25 base pairs of the promoter fragment of the lhcbl gene | 5' ttgtaagatctgaat agcatgtatcagatt taacgaaatgag tctcgcccgcggc 3' | SEQ ID NO 151
Nonshuffled segment XI | Unique sequence e- first 25 nucleotides of 3' UTR from RBCS2 gene | 5' tcttccatcg taaatctagc atcgattagc ccgacgtcgaccca ctctagaggat 3' | Complement of unique sequence f- complement of last 25 base pairs of the promoter fragment of the lhcbl gene | 5' cttgaatgcctcgact agattattacagat taacgaaatgag tctcgcccgcggc 3' | SEQ ID NO 151
Nonshuffled segment XII | Unique sequence f- first 25 nucleotides of synthetic green fluorescent protein gene (SEQ ID NO 32) | 5' atctgtaataatctag tcgaggcattcaag atggccaagggcga ggagctgttca 3' | Complement of unique sequence g- complement of last 25 nucleotides of synthetic green fluorescent protein gene | 5' tcacacgattgttaa cgatttaagccagtt ttacttgtacagctcg tccatgccg 3' | SEQ ID NO 179
Nonshuffled segment XIII | Unique sequence g- first 25 nucleotides of 3' UTR from RBCS2 gene | 5' aactggctta aatcgttaac aatcgtgtga ccgacgtcgaccca ctctagaggat 3' | Complement of unique sequence h- complement of last 24 nucleotides of 3' UTR from RBCS2 gene | 5' tcgcacggtaatcgac agttatgttaaatc caaatacgcccagcc cgcccatgga 3' | SEQ ID NO 150
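
Each 3' primer in Table 3 begins with the complement of a unique sequence, read 5' to 3'. As a hedged illustration (the helper name `reverse_complement` is an assumption, not part of the specification), the following Python sketch checks this for the 3' primer of nonshuffled segment IX, using unique sequence b from Table 2.

```python
# Translation table for complementing lowercase DNA bases.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.lower().translate(COMPLEMENT)[::-1]


# Unique sequence b (SEQ ID NO 153), as listed in Table 2.
UNIQUE_B = "cgtgcatcgattaacagcttctggacctga"

# 3' primer for nonshuffled segment IX, joined from the chunks shown in Table 3.
PRIMER_IX_3 = "".join([
    "tcaggtccagaagctgt",
    "taatcgatgcacg",
    "taacgaaatgag",
    "tctcgcccgcggc",
])

tag = reverse_complement(UNIQUE_B)
template_part = PRIMER_IX_3[len(tag):]

print(PRIMER_IX_3.startswith(tag))  # the primer starts with the complement of b
print(len(template_part))           # nucleotides that anneal to the template
```

The remaining nucleotides after the tag correspond to the template-specific portion described in the 3' primer column.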







TABLE 4 



Oliso # 


5' end 


3' end corresponding to: 


Sequence 




corresponding to: 




11.1 


Unique sequence b 


First 25 nucleotides of 
Chlamydomonas reinhardtii 
hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgtcggcgctcgtgctgaagccct 3' 


11.2 


Unique sequence b 


First 25 nucleotides of Clostriduim 
pasteuranum hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgaaaacaatacLttataaatggtg 3' 


11.3 


Unique sequence b 


First 25 nucleotides of 
Desulfovibrio vulgaris 
hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgagccgtaccgtcatggagcgca 3' 


11.4 


Unique sequence b 


First 25 nucleotides of Entamoeba 
histolytica hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgccacctaaaccatcacatacac 3' 


11.5 


Unique sequence b 


First 25 nucleotides of 
Scenedesmus obliquus 
hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgcctgagtggcaaccgggaggtc 3' 


11.6 


Unique sequence b 


First 25 nucleotides of Chlorella 
fusca hydrogenase 


5' cgtgcatcgattaacagcttctggacctga 
atgtgttgccccgtggttgcaagta 3 ? 


12.1 


Complement of 
unique sequence c 


Complement of last 25 nucleotides 
of Chlamydomonas reinhardtii 
hydrogenase 


5' cttagttatacttggacgtacgacgtttaa 
tcacttcttctcgtccttctcctcc 3' 


12.2 


Complement of 
unique sequence c 


Complement of last 25 nucleotides 
of Clostriduim pasteuranum 
hydrogenase 


5' cttagttatacttggacgtacgacgtttaa 
ttattttttatatttaciagtgtaat 3' 


12.3 


Complement of 
unique sequence c 


Complement of last 25 nucleotides 
of Desulfovibrio vulgaris 
hydrogenase 


5' cttagttatacttggacgtacgacgtttaa 
ctatgccttgttggcgctcgccatg 3' 


12.4 


Complement of 
unique sequence c 


Complemen t of last 25 nucleotides 
of Entamoeba histolytica 
hydrogenase 


5' cttagttatacttggacgtacgacgtttaa 
ttagttttgatatctgggagtaaaa 3' 
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12.5 


Complement of 
unique sequence c 


Complement of last 25 nucleotides 
of Scenedesmus obliquus 
hydrogenase 


5 ' cttagttatacttggacgtacgacgtttaa 
tcacttctcatcgggcacgccgccg 3' 


12.6 


Complement of 
unique sequence c 


Complement of last 25 nucleotides 
of Chlorella fusca hydrogenase 


5' cttagttatacttggacgtacgacgtttaa 
tcacttctcctctggaattccacct 3 ' 


14 


Unique sequence d 


First 25 nucleotides of 
Chlamydomonas reinhardtii 
ferredoxin 


5 ' aatctgatacatgctattcagatcttacaa 
atggccatggctatgcgctccacct 3' 


15 


Complement of 
unique sequence e 


Complement of last 25 nucleotides 
of Chlamydomonas reinhardtii 
ferredoxin 


5' gctaatcgatgctagatttacgatggaaga 
ttagtacagggcctcctcctggtgg 3' 
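
The primer rows above share one pattern: a fixed 30-nucleotide tail (a "unique sequence" or its complement) joined to either the first 25 nucleotides of the gene or the complement of its last 25 nucleotides, written 5' to 3'. The following Python sketch illustrates that construction under stated assumptions. The function names and the toy gene are hypothetical and are not part of the specification. "Complement" is interpreted here as the reverse complement, which is what the tabulated sequences appear to be. The tails are copied from rows 14 and 15 (Chlamydomonas reinhardtii ferredoxin).

```python
# Sketch of how the tailed primers in the table could be assembled.
# Assumption: "complement of last 25 nucleotides" means the reverse
# complement, written 5' to 3'. All function names are hypothetical.

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def forward_primer(tail: str, gene: str, n: int = 25) -> str:
    """Tail followed by the first n nucleotides of the gene."""
    return tail + gene[:n]


def reverse_primer(tail: str, gene: str, n: int = 25) -> str:
    """Tail followed by the reverse complement of the last n nucleotides."""
    return tail + reverse_complement(gene[-n:])


# Tails copied from rows 14 and 15 of the table.
UNIQUE_D = "aatctgatacatgctattcagatcttacaa"
COMPLEMENT_OF_UNIQUE_E = "gctaatcgatgctagatttacgatggaaga"

# Toy gene: the 25-nt ends implied by rows 14 and 15, with a placeholder
# middle region that is NOT the real ferredoxin coding sequence.
toy_gene = "atggccatggctatgcgctccacct" + "gca" * 10 + "ccaccaggaggaggccctgtactaa"

print(forward_primer(UNIQUE_D, toy_gene))
print(reverse_primer(COMPLEMENT_OF_UNIQUE_E, toy_gene))
```

Keeping the tail separate from the gene-specific part mirrors the table, where the same tail is reused across genes from different species.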
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5 344,328; 6,352,842; 6,358,709; 6361,97; 6,368,798; 6,440,668; 6537,776; and 6,605,449. 

Other patents referenced in this application are U.S. Patents 5,871,952, 5,605,79, 5,830,721, 6,165,793, 6,180,406,
5,939,250, 6,171,820, 6,361,974, 6,358,709, 6,352,842, 6,238,884,
6,420,175, 6,287,861, 6,277,589, 4,532,210 and WO 01/48185 (Fischer).
WHAT IS CLAIMED IS:

1. A method for engineering a cell to produce an increased amount of hydrogen comprising:

(a) providing a mutagenized nucleic acid sequence derived from a first gene that encodes a protein involved in a hydrogen production pathway;

(b) transforming a cell with said mutagenized nucleic acid sequence; and

(c) screening or selecting the cell for an increased amount of hydrogen.

2. The method of claim 1, wherein a plurality of mutagenized nucleic acid sequences are used to transform a population of cells, followed by the screening or selecting.

3. The method of claim 1, wherein the first gene is selected from the group that encodes ferredoxin, catalase, isoamylase, malate dehydrogenase, 14-3-3 protein, enolase, aldolase, ribosomal protein S8, ribosomal protein L17, ribosomal protein S18, ribosomal protein L37, ribosomal protein L12, ribosomal protein S15, iron-hydrogenase, nickel-iron hydrogenase, and components of the photosystem I, photosystem II, light harvesting antenna and cytochrome b6-f complexes.

4. The method of claim 3, wherein the first gene encodes an iron-hydrogenase.

5. The method of claim 4, wherein at least one amino acid from the segment X1X2X3FX4X5X6GGVMEAAX7R or the segment ADX8TIX9EE is substituted by a different amino acid in the protein encoded by the first gene to generate the mutagenized nucleic acid sequence.

6. The method of claim 5, wherein the mutagenized nucleic acid sequence is generated by gene reassembly.

7. The method of claim 5, wherein the mutagenized nucleic acid sequence is generated by site-directed mutagenesis.

8. The method of claim 5, wherein an amino acid that is substituted for the at least one amino acid has a side chain of higher molecular weight than the side chain of the at least one amino acid.

9. The method of claim 5, wherein saturation mutagenesis is performed on the at least one amino acid.

10. The method of claim 5, wherein the mutagenized nucleic acid sequence is generated by a mutagenesis method described in U.S. Patents selected from the group consisting of 5,537,776; 5,965,408; 6,171,820; 6,174,673; 6,238,884; 6,326,204; 6,344,328; 6,352,842; 6,358,709; 6361,97; 6,368,798; 6,440,668; 6537,776; and 6,605,449.

11. The method of claim 6, wherein the gene reassembly is performed using nucleic acid molecules that encode proteins of SEQ ID NOs: 1-112 or segments thereof.

12. The method of claim 4, wherein the mutagenized nucleic acid sequence encodes an iron hydrogenase protein that functionally interacts with a ferredoxin protein in the cell.

13. The method of claim 1, wherein the screening or selecting occurs in the presence of oxygen at a concentration selected from the ranges comprising more than 0.5%, more than 5.0%, more than 10%, more than 15%, approximately 21%, more than 21%, more than 25%, more than 30% or more than 35% oxygen.

14. The method of claim 1, wherein the mutagenized nucleic acid sequence is operably linked to a promoter that is activated by light.

15. The method of claim 1, wherein the mutagenized nucleic acid sequence is generated by gene reassembly.

16. The method of claim 1, wherein the cell is a green algae species.

17. The method of claim 1, wherein the cell is of the genus Chlamydomonas.

18. The method of claim 1, further comprising the steps of:

(a) identifying a first independent transformant which produces an increased amount of hydrogen from step (c) of claim 1;

(b) recovering the mutagenized nucleic acid sequence from the independent transformant;

(c) further mutagenizing the recovered mutagenized nucleic acid sequence to create a new library of mutagenized nucleic acid sequences;

(d) transforming cells with the new library of mutagenized nucleic acid sequences; and

(e) screening or selecting for a new independent transformant from the new library that generates an increased amount of hydrogen compared to the first independent transformant.

19. The method of claim 18, wherein the mutagenized nucleic acid sequences are generated by gene reassembly.

20. The method of claim 18, wherein a plurality of mutagenized nucleic acid sequences are recovered from a plurality of independent transformants which produce an increased amount of hydrogen from step (c) of claim 1, and wherein the plurality of mutagenized nucleic acid sequences are subjected to gene reassembly to generate the new library.

21. The method of claim 1, wherein the screening or selecting occurs by culturing cells in liquid growth media.

22. The method of claim 21, wherein the growth media is a photoautotrophic growth-requiring minimal media.

23. The method of claim 1, wherein the screening or selecting occurs in a non-transparent culture container.

24. A method according to claim 1, wherein the mutagenized nucleic acid sequence is operably linked to a promoter that is constitutively activated.

25. The method of claim 15, wherein the mutagenized nucleic acid sequence is obtained by

51 



WO 2005/072262 PCT/US2005/001983 

subjecting nucleic acid sequences that encode proteins that are expressed when cells are exposed to conditions more conducive to the generation of hydrogen to gene reassembly, wherein the proteins are naturally encoded by genes in organisms from more than one species.

26. The method of claim 19, wherein the proteins are iron hydrogenases or nickel-iron hydrogenases.

27. The method of claim 1, further comprising repeating the steps of claim 1 using a second gene distinct from the first gene.

28. The method of claim 27, further comprising:

(a) mating at least one cell of a strain containing a mutagenized form of the first gene:

i. wherein the at least one cell is identified by the screening or selecting; or

ii. wherein the at least one cell is derived through mating from a cell identified by the screening or selecting;

to at least one cell of a distinct strain containing a mutagenized form of the second gene:

iii. wherein the at least one cell is identified by the screening or selecting; or

iv. wherein the at least one cell is derived through mating from a cell identified by the screening or selecting; and

(b) screening or selecting for a progeny cell that produces an increased amount of hydrogen compared to any parent cell.

29. A method of hydrogen production, comprising:

(a) placing a cell containing a mutagenized nucleic acid sequence corresponding to a gene that is involved in a hydrogen production pathway into liquid culture media or on to solid culture media, wherein the mutagenized nucleic acid sequence is operably linked to a transcriptional promoter sequence;

(b) culturing said transformed cell under conditions sufficient to stimulate transcription of said mutagenized nucleic acid sequence(s); and

(c) collecting an evolved gas.

30. The method of claim 29, wherein the culture media is photoautotrophic growth requiring media.

31. A method of multiparental mating of microbes that mate in response to a stimulus, comprising:

(a) providing a cell from each of 3 or more strains of microbes capable of mating to each other in culture medium;

(b) providing the stimulus;

(c) allowing cells to mate and produce progeny;

(d) allowing the progeny cells to achieve sexual reproduction capability;

(e) providing the stimulus at least one more time; and

(f) screening or selecting the further progeny for a desired phenotype.

32. The method of claim 31, wherein the microbes are green algae and the stimulus is the removal of nitrogen from the media and illumination by light comprising wavelengths between about 0.42-0.52 micrometers.

33. The method of claim 32, wherein the green algae are of the Chlamydomonas genus.

34. The method of claim 33, wherein the species is selected from the group comprising reinhardtii, eugametos, incerta, and moewusii.

35. The method of claim 31, wherein the stimulus is interruption of exponential growth in continuous light with a reduction in light, followed by addition of light.

36. The method of claim 35, wherein the reduction in light occurs for a period selected from the group consisting of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more than 12 hours.

37. The method of claim 31, wherein the microbes are of the Scenedesmus genus and the stimulus is the addition of chromium to the culture media.

38. The method of claim 31, wherein the desired phenotype is hydrogen production.

39. The method of claim 31, wherein nucleic acid exchange occurs between only two parental cells at a time during the mating process.
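
The iterative cycle recited in claims 1 and 18 (mutagenize, transform, screen, recover the best sequence, mutagenize again) can be pictured with a small in-silico analogy. The Python sketch below is only a toy simulation under stated assumptions: `toy_score` is a hypothetical stand-in for a measured hydrogen output, single random substitutions stand in for whatever mutagenesis method is actually used, and none of the names come from the specification.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutagenize(parent, n_variants, rng):
    """Return variants that each differ from the parent at one random position."""
    variants = []
    for _ in range(n_variants):
        pos = rng.randrange(len(parent))
        new = rng.choice([a for a in AMINO_ACIDS if a != parent[pos]])
        variants.append(parent[:pos] + new + parent[pos + 1:])
    return variants


def evolve(start, score, rounds=3, library_size=50, seed=0):
    """Repeat library creation and screening, keeping the best sequence so far."""
    rng = random.Random(seed)
    best = start
    history = [score(best)]
    for _ in range(rounds):
        library = mutagenize(best, library_size, rng)  # new library from the recovered sequence
        candidate = max(library, key=score)            # screen the library
        if score(candidate) > score(best):             # keep only improved transformants
            best = candidate
        history.append(score(best))
    return best, history


def toy_score(seq):
    """Hypothetical assay: count positions matching an arbitrary target string."""
    target = "GGVMEAALR"
    return sum(a == b for a, b in zip(seq, target))


best, history = evolve("GGVAEAAKR", toy_score)
print(best, history)
```

Because a candidate replaces the current best only when it scores higher, the recorded score never decreases from one round to the next, which is the property the recursive claims rely on.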
050118 CIP Sequence Listing
SEQUENCE LISTING 

<110> Solazyme, Inc.

Dillon, Harrison F. 

<120> Methods and Compositions for Evolving Microbial Hydrogen
Production

<130> H2042101-CIP 

<140> US 10/763,712 
<141> 2004-01-21 

<150> US 10/287,750 
<151> 2002-11-04 

<150> US 10/411,910 
<151> 2003-04-12 

<150> US 60/500,032 
<151> 2003-09-03 

<160> 184 

<170> PatentIn version 3.2

<210> 1 
<211> 574 
<212> PRT 

<213> Clostridium pasteurianum
<400> 1

Met Lys Thr Ile Ile Ile Asn Gly Val Gln Phe Asn Thr Asp Glu Asp
1 5 10 15

Thr Thr Ile Leu Lys Phe Ala Arg Asp Asn Asn Ile Asp Ile Ser Ala
20 25 30

Leu Cys Phe Leu Asn Asn Cys Asn Asn Asp Ile Asn Lys Cys Glu Ile
35 40 45

Cys Thr Val Glu Val Glu Gly Thr Gly Leu Val Thr Ala Cys Asp Thr
50 55 60

Leu Ile Glu Asp Gly Met Ile Ile Asn Thr Asn Ser Asp Ala Val Asn
65 70 75 80

Glu Lys Ile Lys Ser Arg Ile Ser Gln Leu Leu Asp Ile His Glu Phe
85 90 95

Lys Cys Gly Pro Cys Asn Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu
100 105 110

Val Ile Lys Tyr Lys Ala Arg Ala Ser Lys Pro Phe Leu Pro Lys Asp
115 120 125

Lys Thr Glu Tyr Val Asp Glu Arg Ser Lys Ser Leu Thr Val Asp Arg
130 135 140

Thr Lys Cys Leu Leu Cys Gly Arg Cys Val Asn Ala Cys Gly Lys Asn
145 150 155 160

Thr Glu Thr Tyr Ala Met Lys Phe Leu Asn Lys Asn Gly Lys Thr Ile
165 170 175

Ile Gly Ala Glu Asp Glu Lys Cys Phe Asp Asp Thr Asn Cys Leu Leu
180 185 190

Cys Gly Gln Cys Ile Ile Ala Cys Pro Val Ala Ala Leu Ser Glu Lys
195 200 205

Ser His Met Asp Arg Val Lys Asn Ala Leu Asn Ala Pro Glu Lys His
210 215 220

Val Ile Val Ala Met Ala Pro Ser Val Arg Ala Ser Ile Gly Glu Leu
225 230 235 240

Phe Asn Met Gly Phe Gly Val Asp Val Thr Gly Lys Ile Tyr Thr Ala
245 250 255

Leu Arg Gln Leu Gly Phe Asp Lys Ile Phe Asp Ile Asn Phe Gly Ala
260 265 270

Asp Met Thr Ile Met Glu Glu Ala Thr Glu Leu Val Gln Arg Ile Glu
275 280 285

Asn Asn Gly Pro Phe Pro Met Phe Thr Ser Cys Cys Pro Gly Trp Val
290 295 300

Arg Gln Ala Glu Asn Tyr Tyr Pro Glu Leu Leu Asn Asn Leu Ser Ser
305 310 315 320

Ala Lys Ser Pro Gln Gln Ile Phe Gly Thr Ala Ser Lys Thr Tyr Tyr
325 330 335

Pro Ser Ile Ser Gly Leu Asp Pro Lys Asn Val Phe Thr Val Thr Val
340 345 350

Met Pro Cys Thr Ser Lys Lys Phe Glu Ala Asp Arg Pro Gln Met Glu
355 360 365

Lys Asp Gly Leu Arg Asp Ile Asp Ala Val Ile Thr Thr Arg Glu Leu
370 375 380

Ala Lys Met Ile Lys Asp Ala Lys Ile Pro Phe Ala Lys Leu Glu Asp
385 390 395 400

Ser Glu Ala Asp Pro Ala Met Gly Glu Tyr Ser Gly Ala Gly Ala Ile
405 410 415

Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Ser Ala Lys
420 425 430

Asp Phe Ala Glu Asn Ala Glu Leu Glu Asp Ile Glu Tyr Lys Gln Val
435 440 445

Arg Gly Leu Asn Gly Ile Lys Glu Ala Glu Val Glu Ile Asn Asn Asn
450 455 460

Lys Tyr Asn Val Ala Val Ile Asn Gly Ala Ser Asn Leu Phe Lys Phe
465 470 475 480

Met Lys Ser Gly Met Ile Asn Glu Lys Gln Tyr His Phe Ile Glu Val
485 490 495

Met Ala Cys His Gly Gly Cys Val Asn Gly Gly Gly Gln Pro His Val
500 505 510

Asn Pro Lys Asp Leu Glu Lys Val Asp Ile Lys Lys Val Arg Ala Ser
515 520 525

Val Leu Tyr Asn Gln Asp Glu His Leu Ser Lys Arg Lys Ser His Glu
530 535 540

Asn Thr Ala Leu Val Lys Met Tyr Gln Asn Tyr Phe Gly Lys Pro Gly
545 550 555 560

Glu Gly Arg Ala His Glu Ile Leu His Phe Lys Tyr Lys Lys
565 570
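
The two segments recited in claim 5, X1X2X3FX4X5X6GGVMEAAX7R and ADX8TIX9EE, can be located in a listed sequence by converting the three-letter residues to one-letter codes and searching with a pattern. The Python sketch below is illustrative only. It assumes that each X stands for any single amino acid (the specification's own definition of X is not reproduced in this section), and the helper names are hypothetical. The two fragments are copied from SEQ ID NO: 1, residues 271-283 and 411-431.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert space-separated three-letter residues to a one-letter string."""
    # capitalize() tolerates case damage such as "ser" or "val".
    return "".join(THREE_TO_ONE[r.capitalize()] for r in three_letter.split())


# Assumption: every X position matches any single residue.
SEGMENT_1 = re.compile(r"...F...GGVMEAA.R")  # X1X2X3FX4X5X6GGVMEAAX7R
SEGMENT_2 = re.compile(r"AD.TI.EE")          # ADX8TIX9EE

# Fragments copied from SEQ ID NO: 1.
frag_a = to_one_letter("Gly Ala Asp Met Thr Ile Met Glu Glu Ala Thr Glu Leu")
frag_b = to_one_letter(
    "Gly Ala Gly Ala Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Ser Ala Lys"
)

print(SEGMENT_2.search(frag_a).group())
print(SEGMENT_1.search(frag_b).group())
```

Running the same two patterns over the other listed hydrogenase sequences would show which of them carry both segments and where the substitutable X positions fall.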

<210> 2 
<211> 421 
<212> PRT 

<213> Desulfovibrio vulgaris 
<400> 2 

Met Ser Arg Thr Val Met Glu Arg Ile Glu Tyr Glu Met His Thr Pro
1 5 10 15

Asp Pro Lys Ala Asp Pro Asp Lys Leu His Phe Val Gln Ile Asp Glu
20 25 30

Ala Lys Cys Ile Gly Cys Asp Thr Cys Ser Gln Tyr Cys Pro Thr Ala
35 40 45

Ala Ile Phe Gly Glu Met Gly Glu Pro His Ser Ile Pro His Ile Glu
50 55 60

Ala Cys Ile Asn Cys Gly Gln Cys Leu Thr His Cys Pro Glu Asn Ala
65 70 75 80

Ile Tyr Glu Ala Gln Ser Trp Val Pro Glu Val Glu Lys Lys Leu Lys
85 90 95

Asp Gly Lys Val Lys Cys Ile Ala Met Pro Ala Pro Ala Val Arg Tyr
100 105 110

Ala Leu Gly Asp Ala Phe Gly Met Pro Val Gly Ser Val Thr Thr Gly
115 120 125

Lys Met Leu Ala Ala Leu Gln Lys Leu Gly Phe Ala His Cys Trp Asp
130 135 140

Thr Glu Phe Thr Ala Asp Val Thr Ile Trp Glu Glu Gly Ser Glu Phe
145 150 155 160

Val Glu Arg Leu Thr Lys Lys Ser Asp Met Pro Leu Pro Gln Phe Thr
165 170 175

Ser Cys Cys Pro Gly Trp Gln Lys Tyr Ala Glu Thr Tyr Tyr Pro Glu
180 185 190

Leu Leu Pro His Phe Ser Thr Cys Lys Ser Pro Ile Gly Met Asn Gly
195 200 205

Ala Leu Ala Lys Thr Tyr Gly Ala Glu Arg Met Lys Tyr Asp Pro Lys
210 215 220

Gln Val Tyr Thr Val Ser Ile Met Pro Cys Ile Ala Lys Lys Tyr Glu
225 230 235 240

Gly Leu Arg Pro Glu Leu Lys Ser Ser Gly Met Arg Asp Ile Asp Ala
245 250 255

Thr Leu Thr Thr Arg Glu Leu Ala Tyr Met Ile Lys Lys Ala Gly Ile
260 265 270

Asp Phe Ala Lys Leu Pro Asp Gly Lys Arg Asp Ser Leu Met Gly Glu
275 280 285

Ser Thr Gly Gly Ala Thr Ile Phe Gly Val Thr Gly Gly Val Met Glu
290 295 300

Ala Ala Leu Arg Phe Ala Tyr Glu Ala Val Thr Gly Lys Lys Pro Asp
305 310 315 320

Ser Trp Asp Phe Lys Ala Val Arg Gly Leu Asp Gly Ile Lys Glu Ala
325 330 335

Thr Val Asn Val Gly Gly Thr Asp Val Lys Val Ala Val Val His Gly
340 345 350

Ala Lys Arg Phe Lys Gln Val Cys Asp Asp Val Lys Ala Gly Lys Ser
355 360 365

Pro Tyr His Phe Ile Glu Tyr Met Ala Cys Pro Gly Gly Cys Val Cys
370 375 380

Gly Gly Gly Gln Pro Val Met Pro Gly Val Leu Glu Ala Met Asp Arg
385 390 395 400

Thr Thr Thr Arg Leu Tyr Ala Gly Leu Lys Lys Arg Leu Ala Met Ala
405 410 415

Ser Ala Asn Lys Ala
420 

<210> 3 
<211> 468 
<212> PRT 

<213> Entamoeba histolytica 
<400> 3 

Met Pro Pro Lys Pro Ser His Thr Leu Thr Gly His Asp His Asn His
1 5 10 15

Ser Ile Gln Phe Asp Trp Ser Lys Cys Met Gly Cys Gly Met Cys Ala
20 25 30

Thr Lys Cys Thr Phe Gly Val Leu Val Lys Gln Pro Pro Lys Ile Pro
35 40 45

Pro Phe Val Gln Pro Asn Arg Glu Lys Leu Ser Gln Glu Asn Thr Asp
50 55 60

Lys Thr Arg Val Leu Ile Asp Glu Ser Glu Cys Thr Gly Cys Gly Gln
65 70 75 80

Cys Ser Leu Val Cys Asn Phe Gly Ser Ile Thr Pro Ile Asp His Leu
85 90 95

Val Asp Thr Phe Lys Ala Lys Glu Ala Gly Lys Lys Leu Val Ala Met
100 105 110

Ile Ala Pro Ser Thr Arg Leu Gly Val Ala Glu Ala Met Gly Met Pro
115 120 125

Ile Gly Ser Thr Ala Met Ala Gln Leu Val His Cys Leu Arg Leu Ile
130 135 140

Gly Phe Asp Tyr Val Phe Asp Val Asp Ala Gly Ala Asp Lys Thr Thr
145 150 155 160

Met Asp Asp Tyr Ala Glu Val Ile Glu Met Lys Lys Glu Gly Lys Gly
165 170 175

Pro Ala Ile Thr Ser Cys Cys Pro Ala Trp Ile Glu Leu Val Glu Lys
180 185 190

Glu Tyr Pro Asp Leu Ile Pro Asn Val Ser Thr Ala Arg Ser Pro Ile
195 200 205

Gly Cys Leu Ala Gly Cys Ile Lys Arg Gly Trp Ala Lys Asp Val Gly
210 215 220

Ile Ala Val Glu Asp Leu Tyr Thr Val Gly Ile Met Pro Cys Ile Ala
225 230 235 240

Lys Lys Thr Glu Ser Gln Arg Gln Gln Ile His Gln Asp Tyr Asp Ala
245 250 255

Ser Cys Thr Ser Asn Glu Ile Ala Ala Tyr Phe Lys Lys His Leu Pro
260 265 270

Pro Glu Glu Cys Lys Phe Thr Gln Glu Arg Glu Glu Ala Leu Ala Lys
275 280 285

Thr Glu Asp Gly Gln Cys Asp Leu Pro Phe Arg Arg Ile Ser Gly Gly
290 295 300 
[...] Thr Gly Gly Val Cys Glu Thr Val Leu Arg
305 310 315 320 

Val Ile Ala Arg Asn Ala Gly Val Asp Trp Asn Ser Cys Thr Val Asn
325 330 335

Lys Glu Glu Thr Phe Lys His Ala Ala Ser Gly Ser Thr Met Thr Asn
340 345 350

Leu Ser Val Asp Ile Gly Gly Thr Ile Ile Thr Gly Ala Val Cys His
355 360 365 

Gly 370 11 6 AP9 375 Ala CyS GlU LeU 38O GlV GlU 

Leu Lys Val Asp Val Val Glu Met Met Ala Cys Val Gly Gly Cys Leu
385 390 395 400

Gly Gly Ala Gly Gln Pro Lys Ile Pro Pro Ala Lys Lys Leu Glu Met
405 410 415

Asp Lys Arg Arg Val Met Leu Asp Ile Leu Asp Gln Gln Thr Asp Ile
420 425 430

Arg Ala Ala Asn Glu Asn Thr Asp Val Leu Gly Trp Ile Asp Lys His
435 440 445

Phe Asp His Gln Gly Ala His Gln His Leu His Thr Tyr Phe Thr Pro
450 455 460

Arg Tyr Gln Asn
465 

<210> 4 
<211> 491 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 4 

Met Ser Ala Leu Leu Ser Glu Ser Asp Leu Asn Asp Phe Ile Ser Pro
1 5 10 15

Ala Leu Ala Cys Val Lys Pro Thr Gln Val Ser Gly Gly Lys Lys Asp
20 25 30

Asn Val Asn Met Asn Gly Glu Tyr Glu Val Ser Thr Glu Pro Asp Gln
35 40 45

Leu Glu Lys Val Ser Ile Thr Leu Ser Asp Cys Leu Ala Cys Ser Gly
50 55 60

Cys Ile Thr Ser Ser Glu Glu Ile Leu Leu Ser Ser Gln Ser His Ser
65 70 75 80

Val Phe Leu Lys Asn Trp Gly Lys Leu Ser Gln Gln Gln Asp Lys Phe
85 90 95

[...] Pro Gln Cys Arg Leu Ser Leu Ala Gln Tyr
100 105 110

Tyr Gly Leu Thr Leu Glu Ala Ala Asp Leu Cys Leu Met Asn Phe Phe
115 120 125

Gln Lys His Phe Gln Cys Lys Tyr Met Val Gly Thr Glu Met Gly Arg
130 135 140

Ile Ile Ser Ile Ser Lys Thr Val Glu Lys Ile Ile Ala His Lys Lys
145 150 155 160

Gln Lys Glu Asn Thr Gly Ala Asp Arg Lys Pro Leu Leu Ser Ala Val
165 170 175

Cys Pro Gly Phe Leu Ile Tyr Thr Glu Lys Thr Lys Pro Gln Leu Val
180 185 190

Pro Met Leu Leu Asn Val Lys Ser Pro Gln Gln Ile Thr Gly Ser Leu
195 200 205

Ile Arg Ala Thr Phe Glu Ser Leu Ala Ile Ala Arg Glu Ser Phe Tyr
210 215 220

His Leu Ser Leu Met Pro Cys Phe Asp Lys Lys Leu Glu Ala Ser Arg
225 230 235 240

Pro Glu Ser Leu Asp Asp Gly Ile Asp Cys Val Ile Thr Pro Arg Glu
245 250 255

Ile Val Thr Met Leu Gln Glu Leu Asn Leu Asp Phe Lys Ser Phe Leu
260 265 270

Thr Glu Asp Thr Ser Leu Tyr Gly Arg Leu Ser Pro Pro Gly Trp Asp
275 280 285

Pro Arg Val His Trp Ala Ser Asn Leu Gly Gly Thr Cys Gly Gly Tyr
290 295 300

Ala Tyr Gln Tyr Val Thr Ala Val Gln Arg Leu His Pro Gly Ser Gln
305 310 315 320

Met Ile Val Leu Glu Gly Arg Asn Ser Asp Ile Val Glu Tyr Arg Leu
325 330 335

Leu His Asp Asp Arg Ile Ile Ala Ala Ala Ser Glu Leu Ser Gly Phe
340 345 350

Arg Asn Ile Gln Asn Leu Val Arg Lys Leu Thr Ser Gly Ser Gly Ser
355 360 365

Glu Arg Lys Arg Asn Ile Thr Ala Leu Arg Lys Arg Arg Thr Gly Pro
370 375 380

Lys Ala Asn Ser Arg Glu Met Ala Ala Ala Thr Ala Ala Thr Ala Asp
385 390 395 400

Pro Tyr His Ser Asp Tyr Ile Glu Val Asn Ala Cys Pro Gly Ala Cys
405 410 415

Met Asn Gly Gly Gly Leu Leu Asn Gly Glu Gln Asn Ser Leu Lys Arg
420 425 430

Lys Gln Leu Val Gln Thr Leu Asn Lys Arg His Gly Glu Glu Leu Ala
435 440 445

Met Val Asp Pro Leu Thr Leu Gly Pro Lys Leu Glu Glu Ala Ala Ala
450 455 460

Arg Pro Leu Ser Leu Glu Tyr Val Phe Ala Pro Val Lys Gln Ala Val
465 470 475 480

Glu Lys Asp Leu Val Ser Val Gly Ser Thr Trp
485 490 

<210> 5 
<211> 436 
<212> PRT 

<213> Chlorella fusca
<400> 5

Met Cys Cys Pro Val Val Ala Ser Arg His Ala Gly Arg Ala Arg His
1 5 10 15

Val Ala Val Arg Ala Ala Gly Pro Thr Ser Glu Cys Asp Cys Pro Pro
20 25 30

Thr Pro Gln Ala Lys Leu Pro His Trp Gln Gln Ala Leu Asp Glu Leu
35 40 45

Ala Lys Pro Lys Glu Ser Arg Arg Leu Met Ile Ala Gln Ile Ala Ser
50 55 60

Ala Val Arg Val Ala Ile Ala Glu Thr Ile Gly Leu Ala Pro Gly Asp
65 70 75 80

Val Thr Ile Gly Gln Leu Val Thr Gly Leu Arg Met Leu Gly Phe Asp
85 90 95

Tyr Val Phe Asp Thr Leu Phe Gly Ala Asp Leu Thr Ile Met Glu Glu
100 105 110

Gly Thr Glu Leu Leu His Arg Leu Gln Asp His Leu Glu Gln His Pro
115 120 125

Asn Lys Glu Glu Pro Leu Pro Met Phe Thr Ser Cys Cys Pro Gly Trp
130 135 140

Val Ala Met Val Glu Lys Ser Asn Pro Glu Leu Ile Pro Tyr Leu Ser
145 150 155 160

Ser Cys Lys Ser Pro Gln Met Met Leu Gly Ala Val Ile Lys Asn Tyr
165 170 175

Tyr Ala Gln Gln Val Gly Val Gln Pro Ser Asp Ile Cys Asn Val Ser
180 185 190

Val Met Pro Cys Val Arg Lys Gln Gly Glu Ala Asp Arg Glu Trp Phe
195 200 205

Asn Thr Thr Gly Ala Gly Leu Ala Arg Asp Val Asp His Val Val Thr
210 215 220

Thr Ala Glu Val Gly Lys Ile Phe Leu Glu Arg Gly Ile Lys Leu Asn
225 230 235 240

Glu Leu Pro Glu Ser Asn Phe Asp Asn Pro Ile Gly Glu Gly Thr Gly
245 250 255

Gly Ala Leu Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu
260 265 270

Arg Thr Val Tyr Glu Val Val Thr Gln Lys Pro Met Gly Arg Val Asp
275 280 285

Phe Glu Glu Val Arg Gly Leu Glu Gly Ile Lys Glu Ala Glu Ile Thr
290 295 300

Leu Lys Pro Gly Asp Asp Ser Pro Phe Lys Ala Phe Ala Gly Ala Asp
305 310 315 320

Gly Gln Gly Ile Thr Leu Lys Ile Ala Val Ala Asn Gly Leu Gly Asn
325 330 335

Ala Lys Lys Leu Ile Lys Ser Leu Ser Glu Gly Lys Ala Lys Tyr Asp
340 345 350

Phe Ile Glu Val Met Ala Cys Pro Gly Gly Cys Ile Gly Gly Gly Gly
355 360 365

Gln Pro Arg Ser Thr Asp Lys Gln Ile Leu Gln Lys Arg Gln Gln Ala
370 375 380

Met Tyr Asn Leu Asp Glu Arg Ser Thr Ile Arg Arg Ser His Asp Asn
385 390 395 400

Pro Phe Ile Gln Ala Leu Tyr Asp Lys Phe Leu Gly Ala Pro Asn Ser
405 410 415

His Lys Ala His Asp Leu Leu His Thr His Tyr Val Ala Gly Gly Ile
420 425 430

Pro Glu Glu Lys
435 

<210> 6 
<211> 574 
<212> PRT 

<213> Clostridium saccharobutylicum 
<400> 6 


Met Ile Asn Ile Val Ile Asp Glu Lys Thr Ile Gln Val Gln Glu Asn
1 5 10 15

Thr Thr Val Ile Gln Ala Ala Leu Ala Asn Gly Ile Asp Ile Pro Ser
20 25 30

Leu Cys Tyr Leu Asn Glu Cys Gly Asn Val Gly Lys Cys Gly Val Cys
35 40 45

Ala Val Glu Ile Glu Gly Lys Asn Asn Leu Ala Leu Ala Cys Ile Thr
50 55 60

Lys Val Glu Glu Gly Met Val Val Lys Thr Asn Ser Glu Lys Val Gln
65 70 75 80

Glu Arg Val Lys Met Arg Val Ala Thr Leu Leu Asp Lys His Glu Phe
85 90 95

Lys Cys Gly Pro Cys Pro Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu
100 105 110

Val Ile Lys Thr Lys Ala Lys Ala Asn Lys Pro Phe Val Val Glu Asp
115 120 125

Lys Ser Gln Tyr Ile Asp Ile Arg Ser Lys Ser Ile Val Ile Asp Arg
130 135 140

Thr Lys Cys Val Leu Cys Gly Arg Cys Glu Ala Ala Cys Lys Thr Lys
145 150 155 160

Thr Gly Thr Gly Ala Ile Ser Ile Cys Lys Ser Glu Ser Gly Arg Ile
165 170 175

Val Gln Ala Thr Gly Gly Lys Cys Phe Asp Asp Thr Asn Cys Leu Leu
180 185 190

Cys Gly Gln Cys Val Ala Ala Cys Pro Val Gly Ala Leu Thr Glu Lys
195 200 205

Thr His Val Asp Arg Val Lys Glu Ala Leu Glu Asp Pro Asn Lys His
210 215 220

Val Ile Val Ala Met Ala Pro Ser Ile Arg Thr Ser Met Gly Glu Leu
225 230 235 240

Phe Lys Leu Gly Tyr Gly Val Asp Val Thr Gly Lys Leu Tyr Ala Ser
245 250 255

Met Arg Ala Leu Gly Phe Asp Lys Val Phe Asp Ile Asn Phe Gly Ala
260 265 270

Asp Met Thr Ile Met Glu Glu Ala Thr Glu Phe Ile Glu Arg Val Lys
275 280 285

Asn Asn Gly Pro Phe Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val
290 295 300

Arg Gln Val Glu Asn Tyr Tyr Pro Glu Phe Leu Glu Asn Leu Ser Ser
305 310 315 320

Ala Lys Ser Pro Gln Gln Ile Phe Gly Ala Ala Ser Lys Thr Tyr Tyr
325 330 335

Pro Gln Ile Ser Gly Ile Ser Ala Lys Asp Val Phe Thr Val Thr Ile
340 345 350

Met Pro Cys Thr Ala Lys Lys Phe Glu Ala Asp Arg Glu Glu Met Tyr
355 360 365

Asn Glu Gly Ile Lys Asn Ile Asp Ala Val Leu Thr Thr Arg Glu Leu
370 375 380

Ala Lys Met Ile Lys Asp Ala Lys Ile Asn Phe Ala Asn Leu Glu Asp
385 390 395 400

Glu Gln Ala Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly Val Ile
405 410 415

Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Ala Lys
420 425 430

Asp Phe Val Glu Asp Lys Asp Leu Thr Asp Ile Glu Tyr Thr Gln Ile
435 440 445

Arg Gly Leu Gln Gly Ile Lys Glu Ala Thr Val Glu Ile Gly Gly Glu
450 455 460

Asn Tyr Asn Val Ala Val Ile Asn Gly Ala Ala Asn Leu Ala Glu Phe
465 470 475 480

Met Asn Ser Gly Lys Ile Leu Glu Lys Asn Tyr His Phe Ile Glu Val
485 490 495

Met Ala Cys Pro Gly Gly Cys Val Asn Gly Gly Gly Gln Pro His Val
500 505 510

Ser Ala Lys Glu Arg Glu Lys Val Asp Val Arg Thr Val Arg Ala Ser
515 520 525

Val Leu Tyr Asn Gln Asp Lys Asn Leu Glu Lys Arg Lys Ser His Lys
530 535 540

Asn Thr Ala Leu Leu Asn Met Tyr Tyr Asp Tyr Met Gly Ala Pro Gly
545 550 555 560

Gln Gly Lys Ala His Glu Leu Leu His Leu Lys Tyr Asn Lys
565 570 

<210> 7 
<211> 421 
<212> PRT 

<213> Desulfovibrio vulgaris 
<400> 7

Met Ser Arg Ile Glu Met Glu Lys Ile Phe Tyr Glu Asp His Ala Pro
1 5 10 15

Asp Pro Lys Ala Asp Pro Asp Lys Leu Phe Phe Ile Gln Ile Asp Glu
20 25 30

Ser Lys Cys Ile Gly Cys Asp Ser Cys Gln Gln Tyr Cys Pro Thr Gly
35 40 45

Ala Ile Phe Gly Asp Thr Gly Asp Ala His Lys Ile Pro His Glu Glu
50 55 60

Leu Cys Ile Asn Cys Gly Gln Cys Leu Thr His Cys Pro Val Gly Ala
65 70 75 80

Ile Tyr Glu Ser Gln Ser Trp Val Thr Glu Ile Glu Lys Lys Ile Lys
85 90 95

Ala Lys Asp Val Lys Val Ile Ala Met Pro Ala Pro Ala Val Arg Tyr
100 105 110

Ala Leu Gly Asp Ala Phe Gly Leu Pro Val Gly Thr Val Thr Thr Gly
115 120 125 

LyS 130 135 GlU LeU Gly Phe J|o TPP ASP 

Asn Glu Phe Thr Ala Asp Val Thr Ile Trp Glu Glu Gly Thr Glu Phe
145 150 155 160

Val Gln Arg Leu Thr Lys Lys Leu Asp Lys Pro Leu Pro Gln Phe Thr
165 170 175

Ser Cys Cys Pro Gly Trp His Lys Tyr Val Glu Ser Leu Tyr Pro Glu
180 185 190

Leu Phe Pro His Met Ser Ser Cys Lys Ser Pro Ile Gly Met Leu Gly
195 200 205

Thr Leu Ala Lys Thr Tyr Gly Ala Asp Arg Met Lys Tyr Asp Arg Ala
210 215 220

Lys Val Tyr Thr Val Ser Ile Met Pro Cys Thr Ala Lys Lys Tyr Glu
225 230 235 240

Gly Met Arg Pro Gln Leu Trp Asp Ser Gly His Lys Asp Ile Asp Ala
245 250 255

Thr Ile Asp Thr Arg Glu Leu Ala Tyr Met Ile Lys Lys Ala Lys Ile
260 265 270

Asp Phe Thr Lys Leu Pro Asp Gly Lys Arg Asp Thr Leu Met Gly Glu
275 280 285

Ser Thr Gly Gly Ala Thr Leu Phe Gly Val Thr Gly Gly Val Met Glu
290 295 300

Ala Ala Leu Arg Tyr Ala Tyr Gln Ala Val Thr Gly Lys Lys Pro Glu
305 310 315 320

Ser Met Asp Phe Lys Gly Val Arg Gly Leu Gln Gly Val Lys Glu Ala
325 330 335

Thr Val Asn Val Gly Gly Val Asp Val Lys Val Ala Val Val His Gly
340 345 350

Ala Arg Arg Phe His Asp Val Cys Glu Leu Val Lys Ala Gly Lys Ala
355 360 365

Pro Trp His Phe Ile Glu Phe Met Ala Cys Pro Gly Gly Cys Val Cys
370 375 380

Gly Gly Gly Gln Pro Val Met Pro Gly Val Leu Glu Ala Ala Asp Arg
385 390 395 400

Arg Ser Thr Arg Met Tyr Ala Gly Leu Lys Lys Arg Leu Ala Met Ala
405 410 415

Ser Ala Ser Arg Ala
420 

<210> 8 
<211> 124 
<212> PRT 

<213> Desulfovibrio vulgaris 
<400> 8 

Met Gln Ile Val Asn Leu Thr Arg Arg Gly Phe Leu Lys Ala Ala Cys
1 5 10 15

Val Val Thr Gly Gly Ala Leu Ile Ser Ile Arg Met Thr Gly Lys Ala
20 25 30

Val Ala Ala Ala Lys Gln Leu Lys Asp Tyr Met Met Asp Arg Ile Asn
35 40 45

Gly Val Tyr Gly Ala Asp Ala Lys Phe Pro Val Arg Ala Ser Gln Asp
50 55 60

Asn Val Gln Val Gln Lys Leu Tyr Ala Asp Phe Leu Glu Lys Pro Met
65 70 75 80

Ser His Lys Ala Glu Gln Leu Leu His Thr His Trp Val Asp Arg Ser
85 90 95

Lys Ala Ile Glu Arg Met Lys Ala Gln Gly Ala Tyr Pro Asn Pro Arg
100 105 110

Ala Lys Glu Phe Glu Gly Asn Thr Tyr Pro Tyr Glu
115 120 

<210> 9 
<211> 606

<212> PRT 

<213> Desulfovibrio vulgaris 
<400> 9 

Met Asn Ala Phe Ile Asn Gly Lys Glu Val Arg Cys Glu Pro Gly Arg
1 5 10 15

Thr Ile Leu Glu Ala Ala Arg Glu Asn Gly His Phe Ile Pro Thr Leu
20 25 30

Cys Glu Leu Ala Asp Ile Gly His Ala Pro Gly Thr Cys Arg Val Cys
35 40 45

Leu Val Glu Ile Trp Arg Asp Lys Glu Ala Gly Pro Gln Ile Val Thr
50 55 60

Ser Cys Thr Thr Pro Val Glu Glu Gly Met Arg Ile Phe Thr Arg Thr
65 70 75 80

Pro Glu Val Arg Arg Met Gln Arg Leu Gln Val Glu Leu Leu Leu Ala
85 90 95

Asp His Asp His Asp Cys Ala Ala Cys Ala Arg His Gly Asp Cys Glu
100 105 110

Leu Gln Asp Val Ala Gln Phe Val Gly Leu Thr Gly Thr Arg His His
115 120 125

Phe Pro Asp Tyr Ala Arg Ser Arg Thr Arg Asp Val Ser Ser Pro Ser
130 135 140

Val Val Arg Asp Met Gly Lys Cys Ile Arg Cys Leu Arg Cys Val Ala
145 150 155 160

Val Cys Arg Asn Val Gln Gly Val Asp Ala Leu Val Val Thr Gly Asn
165 170 175

Gly Ile Gly Thr Glu Ile Gly Leu Arg His Asn Arg Ser Gln Ser Ala
180 185 190

Ser Asp Cys Val Gly Cys Gly Gln Cys Thr Leu Val Cys Pro Val Gly
195 200 205

Ala Leu Ala Gly Arg Asp Asp Val Glu Arg Val Ile Asp Tyr Leu Tyr
210 215 220

Asp Pro Glu Ile Val Thr Val Phe Gln Phe Ala Pro Ala Val Arg Val
225 230 235 240

Gly Leu Gly Glu Glu Phe Gly Leu Pro Pro Gly Ser Ser Val Glu Gly
245 250 255

Gln Val Pro Thr Ala Leu Arg Leu Leu Gly Ala Asp Val Val Leu Asp
260 265 270

Thr Asn Phe Ala Ala Asp Leu Val Ile Met Glu Glu Gly Thr Glu Leu
275 280 285

Leu Gin Arg Leu Arg Gly Gly Ala Lys Leu Pro Leu Phe Thr ser Cys 
290 295 300 

Cys Pro Gly Trp Val Asn Phe Ala Glu Lys His Leu Pro Asp lie Leu 
305 310 315 320 

Pro His val ser Thr Thr Arg ser Pro Gin Gin cys Leu Gly Ala Leu 
325 330 335 

Ala Lys Thr Tyr Leu Ala Arg Thr Met Asn Val Ala Pro Glu Arg Met 
340 345 350 

Arg val val ser Leu Met Pro Cys Thr Ala Lys Lys Glu Glu Ala Ala 
355 360 365 

Arg Pro Glu Phe Arg Arg Asp Gly val Arg Asp Val Asp Ala Val Leu 
370 375 ~ 380 

Thr Thr Arg Glu Phe Ala Arg Leu Leu Arg Arg Glu Gly lie Asp Leu 
385 390 395 400 

Ala Gly Leu Glu Pro Ser Pro Cys Asp Asp Pro Leu Met Gly Arq Ala 
405 410 415 

Thr Gly Ala Ala Val lie Phe Gly Thr Thr Gly Gly val Met Glu Ala 
420 425 430 

Ala Leu Arg Thr val Tyr His val Leu Asn Gly Lys Glu Leu Ala Pro 
435 440 445 

val Glu Leu His Ala Leu Arg Gly Tyr Glu Asn Val Arg Glu Ala val 
450 455 460 

val Pro Leu Gly Glu Gly Asn Gly Ser val Lys Val Ala Val Val His 
465 470 475 480 

Gly Leu Lys Ala Ala Arg Gin Met Val Glu Ala val Leu Ala Gly Lys 
485 490 495 

Ala Asp His Val Phe Val Glu val Met Ala cys Pro Gly Gly cys Met 
500 505 510 

Asp Gly Gly Gly Gin Pro Arg ser Lys Arg Ala Tyr Asn Pro Asn Ala 
515 520 525 

Gin Ala Arg Arg Ala Ala Leu Phe ser Leu Asp Ala Glu Asn Ala Leu 
530 535 540 

Arg Gin Ser His Asn Asn Pro Leu lie Gly Lys Val Tyr Glu Ser Phe 
545 550 555 560 

Leu Gly Glu Pro Cys Ser Asn Leu Ser His Arg Leu Leu His Thr Arg 
565 570 575 

[illegible] Glu Val Ala Tyr Thr Met Arg Asp Ile Trp 
580 585 590 

His Glu Met Thr Leu Gly Arg Arg Val Arg Gly Asp ser Asp 
595 600 605 

<210> 10 

<211> 572 

<212> PRT 

<213> Clostridium perfringens 

<400> 10 

Met Asn Lys lie lie lie Asn Asp Lys Thr lie Glu Phe Asp Gly Asp 
1 5 10 15 

Lys Thr lie Leu Asp Leu Ala Arg Glu Asn Gly Phe Asp lie Pro Val 
20 25 30 

Leu cys Glu Leu Lys Asn Cys Gly Asn Lys Gly Gin cys Gly val Cys 
35 40 45 

Leu Val Glu Gin Glu Gly Asn Asp Arg Leu Leu Arg ser cys Ala lie 
50 55 60 

Lys Ala Lys Asp Gly Met val lie Lys Thr Asp ser Glu Lys val Leu 
65 70 75 80 

Glu Ala Arg Lys Glu Arg val Ala Glu Leu Leu Asp Glu His Glu Phe 
85 90 95 

Lys cys Gly Pro Cys Lys Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu 
100 105 110 

val lie Lys Thr Lys Ala Arg Ala His Lys Pro Phe Val val Ala Asp 
115 120 125 

Lys Ser Glu Tyr Val Asp Asp Arg Ser Lys Ser Ile Val Leu Asp Arg 
130 135 140 

ser Lys cys val Lys cys Gly Arg cys val Ala Ala cys Arg Thr Arg 
145 150 155 160 

Thr Ala Thr Asn ser lie Lys Phe His Arg lie Asp Gly val Arg Leu 
165 170 175 

Val Gly Pro Glu Glu Leu Lys Cys Phe Asp Asp Thr Asn cys Leu Leu 
180 185 190 

Cys Gly Gin cys lie Ala Ala cys Pro Val Asp Ala Leu ser Glu Lys 
195 200 205 

Ser His lie Glu Arg Val Gin Glu Ala Leu Asn Asp Pro Glu Lys His 
210 215 220 

Val lie Val Ala Met Ala Pro Ala val Arg Thr Ser Met Gly Glu Leu 
225 230 235 240 
[illegible] Asp Val Thr Gly Lys Leu Tyr Thr Ala 
245 250 255 

Leu Arg Glu Leu Gly Phe Asp Lys Val Phe Asp lie Asn Phe Gly Ala 
260 265 270 

Asp Met Thr Ile Met Glu Glu Ala Thr Glu Leu Ile Glu Arg Ile Lys 
275 280 285 

Asn Asn Gly Pro Phe Pro Met Leu Thr Ser Cys cys Pro ser Trp Val 
290 295 300 

Arg Glu Val Glu Asn Tyr Phe Pro Glu Leu Val Glu Asn Leu Ser Ser 
305 310 315 320 

Ala Lys Ser Pro Gin Gin lie Phe Gly Ala Ala Ser Lys Thr Tyr Tyr 
325 330 335 

Pro Gln Val Ala Asp Ile Asp Pro Lys Lys Val Phe Thr Val Thr Val 
340 345 350 

Met Pro Cys Thr Ser Lys Lys Phe Glu Ala Asp Arg Pro Glu Met Glu 
355 360 365 

Asn Glu Gly Ile Arg Asn Ile Asp Ala Val Ile Thr Thr Arg Glu Leu 
370 375 380 

Ala Arg Met lie Lys Ala Ala Lys lie Asp Phe Ala Lys Leu Glu Asp 
385 390 395 400 

Gly Glu val Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly val lie 
405 410 415 

Phe Gly Ala Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Lys 
420 425 430 

Asp Phe Met Glu Asn Asp Asn Leu Asp Asn val Asp Tyr Glu Ala val 
435 440 445 

Arg Gly Leu Ala Gly Ile Lys Glu Ala Glu Val Glu Ile Ala Gly Asn 
450 455 460 

Glu Tyr Lys Leu Ala val val ser Gly Ala Ala Asn Val Phe Glu Leu 
465 470 475 480 

val Lys ser Gly Lys lie Asn Asp Tyr His Phe lie Glu val Met Ala 
485 490 495 

Cys Pro Gly Gly Cys Val Asn Gly Gly Gly Gln Pro His Ile Ser Ala 
500 505 510 

Glu Asp ser Asp Lys Met Asp lie Arg Glu Val Arg Ala ser val Leu 
515 520 525 

Tyr Asn Gin Asp Lys Asn Leu Glu Lys Arg Lys ser His Gin Asn ser 
530 535 540 
Ala Leu Leu Lys Met Tyr Glu Ser Tyr Met Gly Lys Pro Gly His Gly 
545 550 555 560 

Arg Ala His Glu Leu Leu His Met Lys Tyr Lys Lys 
565 570 

<210> 11 
<211> 572 
<212> PRT 

<213> Clostridium perfringens 
<400> 11 

Met Asn Lys lie lie lie Asn Asp Lys Thr lie Glu Phe Asp Gly Asp 
1 5 10 15 

Lys Thr lie Leu Asp Leu Ala Arg Glu Asn Gly Phe Asp lie Pro Val 
20 25 30 

Leu Cys Glu Leu Lys Asn cys Gly Asn Lys Gly Gin Cys Gly Val cys 
35 40 45 

Leu Val Glu Gin Glu Gly Asn Asp Arg Leu Leu Arg ser Cys Ala lie 
50 55 60 

Lys Ala Lys Asp Gly Met Val lie Lys Thr Asp ser Glu Lys Val Leu 
65 70 75 80 

Glu Ala Arg Lys Glu Arg Val Ala Glu Leu Leu Asp Glu His Glu Phe 
85 90 95 

Lys cys Gly Pro cys Lys Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu 
100 105 110 

Val lie Lys Thr Lys Ala Arg Ala His Lys Pro Phe Val Val Ala Asp 
115 120 125 

Lys Ser Glu Tyr val Asp Asp Arg ser Lys Ser lie val Leu Asp Arg 
130 135 140 

Ser Lys Cys val Lys Cys Gly Arg cys val Ala Ala cys Arg Thr Arg 
145 150 155 160 

Thr Ala Thr Asn Ser Ile Lys Phe His Arg Ile Asp Gly Val Arg Leu 
165 170 175 

val Gly Pro Glu Glu Leu Lys Cys Phe Asp Asp Thr Asn cys Leu Leu 
180 185 190 

cys Gly Gin cys lie Ala Ala Cys Pro Val Asp Ala Leu ser Glu Lys 
195 200 205 

Ser His lie Glu Arg Val Gin Asp Ala Leu Asn Asp Pro Glu Lys His 
210 215 220 

val lie val Ala Met Ala Pro Ala Val Arg Thr ser Met Gly Glu Leu 
225 230 235 240 
Phe Lys Met Gly Tyr Gly Gln Asp Val Thr Gly Lys Leu Tyr Thr Ala 
245 250 255 

Leu Arg Glu Leu Gly Phe Asp Lys Val Phe Asp lie Asn Phe Gly Ala 
260 265 270 

Asp Met Thr lie Met Glu Glu Ala Thr Glu Leu lie Glu Arg lie Lys 
275 280 285 

Asn Asn Gly Pro Phe Pro Met Leu Thr ser cys cys Pro ser Trp val 
290 295 300 

Arg Glu val Glu Asn Tyr Phe Pro Glu Leu val Glu Asn Leu ser ser 
305 310 315 320 

Ala Lys Ser Pro Gln Gln Ile Phe Gly Ala Ala Ser Lys Thr Tyr Tyr 
325 330 335 

Pro Gin val Ala Asp lie Asp Pro Lys Lys val Phe Thr val Thr val 
340 345 350 

Met Pro cys Thr Ser Lys Lys Phe Glu Ala Asp Arg Pro Glu Met Glu 
355 360 365 

Asn Glu Gly lie Arg Asn lie Asp Ala val lie Thr Thr Arg Glu Leu 
370 375 380 

Ala Arg Met lie Lys Ala Ala Lys lie Asp Phe Ala Lys Leu Glu Asp 
385 390 395 400 

Gly Glu val Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly val lie 
405 410 415 

Phe Gly Ala Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Lys 
420 425 430 

Asp Phe Met Glu Asn Asp Asn Leu Asp Asn val Asp Tyr Glu Ala val 
435 440 445 

Arg Gly Leu Ala Gly Ile Lys Glu Ala Glu Val Glu Ile Ala Gly Asn 
450 455 460 

Glu Tyr Lys Leu Ala Val val Ser Gly Ala Ala Asn val Phe Glu Leu 
465 470 475 480 

Val Lys ser Gly Lys lie Asn Asp Tyr His Phe lie Glu val Met Ala 
485 490 495 

Cys Pro Gly Gly cys Val Asn Gly Gly Gly Gin Pro His lie ser Ala 
500 505 510 

Glu Asp ser Asp Lys lie Asp lie Arg Glu Val Arg Ala ser val Leu 
515 520 525 

Tyr Asn Gin Asp Lys Asn Leu Glu Lys Arg Lys ser His Gin Asn ser 
530 535 540 
Ala Leu Leu Lys Met Tyr Glu Asn Tyr Met Gly Lys Pro Gly His Gly 
545 550 555 560 

Arg Ala His Glu Leu Leu His Met Lys Tyr Lys Lys 
565 570 

<210> 12 
<211> 484 
<212> PRT 

<213> Megasphaera elsdenii 
<400> 12 

Met Pro Glu Phe His Ser Arg Phe Glu Lys lie Asp Arg Arg val Pro 
1 5 10 15 

lie Asp Glu His Asn Cys Ala Val Gin Phe Asp Val Thr Lys Cys Lys 
20 25 30 

Asn cys Thr Leu cys Arg Arg Ala Cys Ala Asp Thr Gin Thr Val Leu 
35 40 45 

Asp Tyr Tyr Ser Leu ser ser Thr Gly Asp Met Pro lie Cys Val His 
50 55 60 

cys Gly Gin Cys Ser Ser Ala Cys Pro Phe Gly Ala lie Val Glu Val 
65 70 75 80 

Asn Asp Val Asp Lys Val Lys Ala Ala Leu Lys Asp Pro Glu Lys lie 
85 90 95 

Val lie Phe Gin Thr Ala Pro Ala val Arg val Gly Leu Gly Glu Ala 
100 105 110 

Phe Gly Met Asp Pro Gly Thr Phe Val Glu Gly Lys Met val Ala Ala 
115 120 125 

Leu Arg Thr Leu Gly Ala Asp Tyr Val Phe Asp Thr Asp Phe Gly Ala 
130 135 140 

Asp Leu Thr lie Met Glu Glu Ala Thr Glu Leu Leu His Arg Leu Gin 
145 150 155 160 

Ser Glu Glu lie Pro lie Pro Gin Phe Thr Ser cys Cys Pro Ala Trp 
165 170 175 

Val Glu Phe Ala Glu Thr Phe Tyr Pro Asp Leu Leu Gin His Leu ser 
180 185 190 

Ser Thr Lys ser Pro lie Ser lie Leu ser Pro Val lie Lys Thr Tyr 
195 200 205 

Phe Ala Gin Gin Lys Asn lie Asp Pro Lys Lys lie Val Asn Val cys 
210 215 220 

Val Thr Pro Cys Thr Ala Lys Lys Ala Glu Ile Arg Arg Pro Glu Leu 
225 230 235 240 
Ser Ala Ser Gly Leu Phe Trp Asp Glu Pro Glu Ile Arg Asp Thr Asp 
245 250 255 

Ile Cys Ile Thr Thr Arg Glu Leu Ala Gln Trp Ile Gln Asp Glu Asn 
260 265 270 

lie Asp Phe Ala ser Leu Glu Asp ser Lys Phe Asp Lys Ala Phe Gly 
275 280 285 

Glu Ala ser Gly Gly Gly Arg lie Phe Gly Asn ser Gly Gly Val Met 
290 295 300 

Glu Ala Ala lie Arg Thr Ala Tyr His Met Phe Thr Gly Arg Pro Ala 
305 310 315 320 

Pro Lys Asp Phe lie Pro Phe Glu Pro Val Arg Gly Leu Gin Gly val 
325 330 335 

Lys Lys Ala Thr val lie Phe Gly His Phe val Leu His val Ala Ala 
340 345 350 

lie ser Gly Leu Gly Asn Ala Arg Ala Phe lie Asp Asp Leu lie Lys 
355 360 365 

Asn Asp Ala Phe Glu Asp Tyr ser Phe lie Glu val Met Ala cys Pro 
370 375 380 

Gly Gly Cys lie Gly Gly Gly Gly Gin Pro Lys val Lys Leu Pro Gin 
385 390 395 400 

val Lys Lys val Gin Glu Ala Arg Thr Ala ser lie Tyr Lys ser Asp 
405 410 415 

Glu Glu Thr Asp lie Lys Ala ser Trp Gin Asn Pro Glu lie Glu Thr 
420 425 430 

Leu Tyr Glu Ala Phe Leu Asp Glu Pro Leu Ser Glu Met Ala Glu Phe 
435 440 445 

Thr Leu His Thr Tyr Phe ser Asp Lys ser Asp Gin Leu Gly Arg Met 
450 455 460 

Lys Asn Leu Thr Pro Gin Thr Asn Pro Met Ser Pro Lys Tyr Lys Pro 
465 470 475 480 

Pro Thr Glu Glu 



<210> 13 
<211> 421 
<212> PRT 

<213> Desulfovibrio desulfuricans strain 
<400> 13 

Met Asn Leu Val Glu Met Glu Lys lie Gin Tyr Val Asp Gin ser Pro 
1 5 10 15 
Asp Pro Arg Ala Asn Pro Asp Glu Leu Phe Phe Ile Gln Ile Asp Pro 
20 25 30 

Glu Lys cys lie Gly Cys Asp Thr Cys Gin Glu Tyr cys Pro Thr Gly 
35 40 45 

Ala Ile Phe Gly Asp Thr Gly Ser Ala His Ser Ile Pro His Glu Glu 
50 55 60 

lie Cys lie Asn Cys Gly Gin Cys Leu Thr His Cys Pro Val Gly Ala 
65 70 75 80 

Ile Tyr Glu Val Gln Ser Trp Val Arg Glu Leu Ser Glu Lys Ile Lys 
85 90 95 

Asp Pro Glu lie Lys Val lie Ala Met Pro Ala Pro Ala val Arg Tyr 
100 105 110 

Gly Leu Gly Glu Cys Phe Gly Met Pro Val Gly Thr Val Thr Thr Gly 
115 120 125 

Lys Met Leu Thr Ala Leu Gin Met Leu Gly Phe Asp His Val Trp Asp 
130 135 140 

Asn Glu Phe Thr Ala Asp val Thr lie Trp Glu Glu Gly Thr Glu Phe 
145 150 155 160 

val Asn Arg Leu Thr Gly Gin lie Asp Lys Pro Leu Pro Gin Phe Thr 
165 170 175 

Ser Cys Cys Pro Gly Trp His Lys Tyr val Glu Ser Phe Tyr Pro Glu 
180 185 190 

Leu Phe Pro His Leu ser Ser cys Lys Ser Pro lie Gly Met Met Gly 
195 200 205 

Ala Leu Ala Lys Thr Tyr Gly Pro Asp val Met Lys Tyr Asp Arg Ser 
210 215 220 

Lys val Tyr Thr val Ser lie Met Pro Cys Thr Ala Lys Lys Tyr Glu 
225 230 235 240 

Gly Met Arg Ala Asp Leu Trp ser Ser Gly Tyr Lys Asp lie Asp Ala 
245 250 255 

Thr lie Asp Thr Arg Glu Leu Ala Tyr Met lie Lys Lys Ala Gly lie 
260 265 270 

Asp Phe Ala Ala Leu Pro Asp Gly Lys Arg Asp Thr Leu Met Gly Asp 
275 280 285 

Ser Thr Gly Gly Ala Thr lie Phe Gly Val Ser Gly Gly val Met Glu 
290 295 300 

Ala Ala Leu Arg Tyr Ala Tyr Glu Ala val Thr Gly Lys Lys Pro Ser 
305 310 315 320 

Ser Trp Asp Phe Thr Met Val Arg Gly Leu Asn Gly Ile Lys Glu Gly 
325 330 335 

Thr Val Thr lie Gly Asp Ala Lys lie Asn val Ala val val His Gly 
340 345 350 

Ala Lys Arg Phe Ala Glu Val cys Glu val lie Lys Thr Gly Lys Ser 
355 360 365 

Pro Trp His Phe lie Glu Phe Met Ala Cys Pro Gly Gly cys val cys 
370 375 380 

Gly Gly Gly Gln Pro Val Met Pro Gly Val Leu Glu Ala Met Asp Arg 
385 390 395 400 

Lys Val ser Arg Thr Phe Ala Gly Leu Lys Glu Arg Leu Asn Arg Met 
405 410 415 

Ser Ser ser Lys Ala 
420 

<210> 14 
<211> 585 
<212> PRT 

<213> Desulfovibrio fructosovorans 
<400> 14 

Met Ser Met Leu Thr lie Thr lie Asp Gly Lys Thr Thr Ser val Pro 
1 5 10 15 

Glu Gly ser Thr lie Leu Asp Ala Ala Lys Thr Leu Asp lie Asp lie 
20 25 30 

Pro Thr Leu cys Tyr Leu Asn Leu Glu Ala Leu Ser lie Asn Asn Lys 
35 40 45 

Ala Ala Ser cys Arg val cys val val Glu Val Glu Gly Arg Arg Asn 
50 55 60 

Leu Ala Pro Ser Cys Ala Thr Pro val Thr Asp Asn Met Val val Lys 
65 70 75 80 

Thr Asn Ser Leu Arg Val Leu Asn Ala Arg Arg Thr val Leu Glu Leu 
85 90 95 

Leu Leu Ser Asp His Pro Lys Asp Cys Leu Val Cys Ala Lys Ser Gly 
100 105 110 

Glu cys Glu Leu Gin Thr Leu Ala Glu Arg Phe Gly lie Arg Glu ser 
115 120 125 

Pro Tyr Asp Gly Gly Glu Met Ser His Tyr Arg Lys Asp lie ser Ala 
130 135 140 

Ser Ile Ile Arg Asp Met Asp Lys Cys Ile Met Cys Arg Arg Cys Glu 
145 150 155 160 

Thr Met Cys Asn Thr Val Gln Thr Cys Gly Val Leu Ser Gly Val Asn 
165 170 175 

Arg Gly Phe Thr Ala Val val Ala Pro Ala Phe Glu Met Asn Leu Ala 
180 185 190 

Asp Thr Val Cys Thr Asn cys Gly Gin cys Val Ala Val cys Pro Thr 
195 200 205 

Gly Ala Leu Val Glu His Glu Tyr lie Trp Glu val val Glu Ala Leu 
210 215 220 

Ala Asn Pro Asp Lys Val val lie val Gin Thr Ala Pro Ala val Arg 
225 230 235 240 

Ala Ala Leu Gly Glu Asp Leu Gly Val Ala Pro Gly Thr Ser Val Thr 
245 250 255 

Gly Lys Met Ala Ala Ala Leu Arg Arg Leu Gly Phe Asp His val Phe 
260 265 270 

Asp Thr Asp Phe Ala Ala Asp Leu Thr lie Met Glu Glu Gly ser Glu 
275 280 285 

Phe Leu Asp Arg Leu Gly Lys His Leu Ala Gly Asp Thr Asn val Lys 
290 295 300 

Leu Pro lie Leu Thr Ser Cys cys Pro Gly Trp val Lys Phe Phe Glu 
305 310 315 320 

His Gin Phe Pro Asp Met Leu Asp val Pro ser Thr Ala Lys Ser Pro 
325 330 335 

Gin Gin Met Phe Gly Ala lie Ala Lys Thr Tyr Tyr Ala Asp Leu Leu 
340 345 350 

Gly lie Pro Arg Glu Lys Leu val val Val ser val Met Pro cys Leu 
355 360 365 

Ala Lys Lys Tyr Glu Cys Ala Arg Pro Glu Phe ser Val Asn Gly Asn 
370 375 380 

Pro Asp Val Asp lie Val lie Thr Thr Arg Glu Leu Ala Lys Leu Val 
385 390 395 400 

Lys Arg Met Asn lie Asp Phe Ala Gly Leu Pro Asp Glu Asp Phe Asp 
405 410 415 

Ala Pro Leu Gly Ala ser Thr Gly Ala Ala Pro lie Phe Gly val Thr 
420 425 430 

Gly Gly Val lie Glu Ala Ala Leu Arg Thr Ala Tyr Glu Leu Ala Thr 
435 440 445 
450 455 460 

Gly val Lys Lys Ala Lys val Lys Val Gly Asp Asn Glu Leu val lie 
465 470 475 480 

Gly val Ala His Gly Leu Gly Asn Ala Arg Glu Leu Leu Lys Pro cys 
485 490 495 

Gly Ala Gly Glu Thr Phe His Ala lie Glu val Met Ala cys Pro Gly 
500 505 510 

Gly Cys Ile Gly Gly Gly Gly Gln Pro Tyr His His Gly Asp Val Glu 
515 520 525 

Leu Leu Lys Lys Arg Thr Gin Val Leu Tyr Ala Glu Asp Ala Gly Lys 
530 535 540 

Pro Leu Arg Lys Ser His Glu Asn Pro Tyr lie lie Glu Leu Tyr Glu 
545 550 555 560 

Lys Phe Leu Gly Lys Pro Leu ser Glu Arg ser His Gin Leu Leu His 
565 570 575 

Thr His Tyr Phe Lys Arg Gin Arg Leu 
580 585 

<210> 15 
<211> 421 
<212> PRT 

<213> Desulfovibrio fructosovorans 
<400> 15 

Met ser Arg lie Glu Met Ala Lys lie Phe Tyr Glu Gin Thr val Pro 
1 5 10 15 

Pro Pro Gly Thr Asn Leu Asp Gin Ala Tyr lie Val Gin Val Asp Glu 
20 25 30 

Thr Lys Cys lie Gly Cys Asp Thr Cys Met Gly Tyr Cys Pro Thr Gly 
35 40 45 

Ala lie Thr Gly Glu ser Gly Glu Pro His Lys val val Asp Pro Ala 
50 55 60 

Ala Cys lie Asn Cys Gly Gin Cys Leu Thr His cys Pro Val Ala Ala 
65 70 75 80 

lie Tyr Glu Thr val ser Phe val Pro Glu lie Glu Ala Lys Leu Lys 
85 90 95 

Asp Lys Asn Val Lys Val Ile Ala Met Pro Ala Pro Ala Val Arg Tyr 
100 105 110 

Ala Leu Gly Asp Pro Phe Gly Met Pro Leu Gly Ala val Thr Thr Glu 
115 120 125 
[illegible] Gln Leu Gly Phe Asp Asn Val Trp Asp 
130 135 140 

Asn Glu Phe Thr Ala Asp Val Thr lie Trp Glu Glu Gly Ser Glu Leu 
145 150 155 160 

Leu Ala Arg lie Thr Lys Lys Leu Asp Lys Pro Leu Pro Gin Phe Thr 
165 170 175 

Ser cys cys Pro Gly Trp Gin Lys Tyr Ala Glu Thr Phe Tyr Pro Glu 
180 185 190 

Leu Leu Pro His Phe Ser Ser Cys Lys Ser Pro Ile Gly Met Met Gly 
195 200 205 

Pro Leu Ala Lys Thr Tyr Gly Ala Lys Glu Leu Gly Tyr Glu Pro Lys 
210 215 220 

Gin lie Tyr Thr val Ser lie Met Pro Cys Thr Ala Lys Lys Phe Glu 
225 230 235 240 

Gly Met Arg Pro Glu Met Asp Ala Ser Gly Phe Arg Asp lie Asp Ala 
245 250 255 

Thr lie Asn Thr Arg Glu Leu Ala Tyr Met Met Lys Lys Ala Gly lie 
260 265 270 

Asp Leu Pro Lys lie Ala Asn Gly Lys Arg Asp Ala Val Met Gly Glu 
275 280 285 

Ser Thr Gly Gly Ala Thr lie Phe Gly val ser Gly Gly Val Met Glu 
290 295 300 

Ala Ala Leu Arg Phe Ala Tyr Gin Ala Leu Thr Lys Lys Pro Pro Gin 
305 310 315 320 

ser Trp Asp Phe Lys Ala val Arg Gly Leu Asn Gly lie Lys Glu Ala 
325 330 335 

Thr Ile Asn Ile Gly Gly Thr Asp Val Lys Val Ala Val Val Asn Gly 
340 345 350 

Gly Lys Asn Phe Ala Lys val cys Asp Glu val Lys Ala Gly Lys ser 
355 360 365 

Pro Tyr His Phe lie Glu Phe Met Ala Cys Pro Gly Gly Cys val Met 
370 375 380 

Gly Gly Gly Gin Pro lie Met Pro Thr Val Leu Glu Ser Met Asn Arg 
385 390 395 400 

Thr Thr Thr Lys Phe Tyr Ala ser Leu Lys Lys Arg Leu Ala Leu Tyr 
405 410 415 

Asp Ala Gin Lys Ala 
420 

<210> 16 
<211> 608 
<212> PRT 

<213> Thermotoga maritima 
<400> 16 

Met Arg Arg Phe Phe Lys Asn Asn Leu Arg Asn Leu Ser Gin Asn Gly 
1 5 10 15 

Glu Thr Asn Ser Val Arg Arg Cys Phe Ala Leu Ala Asp Val Thr val 
20 25 30 

Val lie Asn Gly Arg Thr Leu Thr Val Pro Asp Asn Leu Thr Val lie 
35 40 45 

Glu Ala cys Glu Lys Ala Gly lie Glu lie Pro Ala Leu cys His His 
50 55 60 

Pro Arg Leu Gly Glu ser lie Gly Ala cys Arg val Cys val val Glu 
65 70 75 80 

val Glu Gly Ala Arg Asn Leu Gin Pro Ala cys Val Thr Lys Val Arg 
85 90 95 

Asp Gly Met Val lie Lys Thr ser Ser Asp Arg val Lys Thr Ala Arg 
100 105 110 

Lys Phe Asn Leu Ala Leu Leu Leu ser Glu His Pro Asn Asp Cys Met 
115 120 125 

Thr Cys Glu Ala Asn Gly Arg Cys Glu Phe Gin Asp Leu lie Tyr Lys 
130 135 140 

Tyr Asp Val Glu Pro lie Phe Gly Tyr Gly Thr Lys Glu Gly Leu val 
145 150 155 160 

Asp Arg Ser Ser Pro Ala Ile Val Arg Asp Leu Ser Lys Cys Ile Lys 
165 170 175 

cys Gin Arg cys val Arg Ala cys ser Glu Leu Gin Gly Met His lie 
180 185 190 

Tyr Ser Met Val Glu Arg Gly His Arg Thr Tyr Pro Gly Thr Pro Phe 
195 200 205 

Asp Met Pro Val Tyr Glu Thr Asp Cys lie Gly Cys Gly Gin Cys Ala 
210 215 220 

Ala Phe cys Pro Thr Gly Ala lie Val Glu Asn Ser Ala val Lys val 
225 230 235 240 

Val Leu Glu Glu Leu Glu Lys Lys Glu Lys lie Leu Val val Gin Thr 
245 250 255 

Ala Pro Ser Val Arg Val Ala lie Gly Glu Glu Phe Gly Tyr Ala Pro 
260 265 270 
Gly Thr Ile Ser Thr Gly Gln Met Val Ala Ala Leu Arg Arg Leu Gly 
275 280 285 

Phe Asp Tyr val Phe Asp Thr Asn Phe Gly Ala Asp Leu Thr lie Met 
290 295 300 

Glu Glu Gly Ser Glu Phe Leu Glu Arg Leu Glu Lys Gly Asp Leu Glu 
305 310 315 320 

Asp Leu Pro Met Phe Thr ser Cys cys Pro Gly Trp Val Asn Leu val 
325 330 335 

Glu Lys Val Tyr Pro Glu Leu Arg Thr Arg Leu Ser Ser Ala Lys Ser 
340 345 350 

Pro Gin Gly Met Leu Ser Ala Met val Lys Thr Tyr Phe Ala Glu Lys 
355 360 365 

Leu Gly Val Lys Pro Glu Asp lie Phe His val Ser lie Met Pro Cys 
370 375 380 

Thr Ala Lys Lys Asp Glu Ala Leu Arg Lys Gin Leu Met val Asn Gly 
385 390 395 400 

val Pro Ala val Asp val Val Leu Thr Thr Arg Glu Leu Gly Lys Leu 
405 410 415 

lie Arg Met Lys Lys lie Pro Phe Ala Asn Leu Pro Glu Glu Glu Tyr 
420 425 430 

Asp Ala Pro Leu Gly lie Ser Thr Gly Ala Ala Ala Leu Phe Gly val 
435 440 445 

Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Tyr Glu Leu Lys 
450 455 460 

Thr Gly Lys Ala Leu Pro Lys lie Val Phe Glu Glu Val Arg Gly Leu 
465 470 475 480 

Lys Gly Val Arg Glu Ala Glu lie Asp Leu Asp Gly Lys Lys lie Arg 
485 490 495 

lie Ala Val Val His Gly Thr Ala Asn Val Arg Asn Leu Val Glu Lys 
500 505 510 

lie Leu Arg Arg Glu Val Lys Tyr His Phe Val Glu val Met Ala Cys 
515 520 525 

Pro Gly Gly cys lie Gly Gly Gly Gly Gin Pro Tyr Ser Arg Asp Pro 
530 535 540 

Glu lie Leu Arg Lys Arg Ala Glu Ala lie Tyr Thr lie Asp Glu Arg 
545 550 555 560 

Met Thr Leu Arg Lys Ser His Glu Asn Pro Ala lie Lys Lys Leu Tyr 
565 570 575 
Glu Glu Tyr Leu Glu His Pro Leu Ser His Lys Ala His Glu Leu Leu 
580 585 590 

His Thr Tyr Tyr Glu Asp Arg ser Arg Lys Lys Arg Leu Ala val Lys 
595 600 605 

<210> 17 
<211> 645 
<212> PRT 

<213> Thermotoga maritima 
<400> 17 

Met Lys Ile Tyr Val Asp Gly Arg Glu Val Ile Ile Asn Asp Asn Glu 
1 5 10 15 

Arg Asn Leu Leu Glu Ala Leu Lys Asn Val Gly lie Glu lie Pro Asn 
20 25 30 

Leu cys Tyr Leu Ser Glu Ala ser lie Tyr Gly Ala Cys Arg Met Cys 
35 40 45 

Leu Val Glu lie Asn Gly Gin lie Thr Thr Ser cys Thr Leu Lys Pro 
50 55 60 

Tyr Glu Gly Met Lys val Lys Thr Asn Thr Pro Glu lie Tyr Glu Met 
65 70 75 80 

Arg Arg Asn lie Leu Glu Leu lie Leu Ala Thr His Asn Arg Asp cys 
85 90 95 

Thr Thr cys Asp Arg Asn Gly ser Cys Lys Leu Gin Lys Tyr Ala Glu 
100 105 110 

Asp Phe Gly lie Arg Lys lie Arg Phe Glu Ala Leu Lys Lys Glu His 
115 120 125 

Val Arg Asp Glu Ser Ala Pro Val val Arg Asp Thr ser Lys cys lie 
130 135 140 

Leu cys Gly Asp cys val Arg val Cys Glu Glu lie Gin Gly Val Gly 
145 150 155 160 

Val lie Glu Phe Ala Lys Arg Gly Phe Glu Ser Val val Thr Thr Ala 
165 170 175 

Phe Asp Thr Pro Leu lie Glu Thr Glu Cys val Leu cys Gly Gin Cys 
180 185 190 

val Ala Tyr Cys Pro Thr Gly Ala Leu Ser lie Arg Asn Asp lie Asp 
195 200 205 

Lys Leu lie Glu Ala Leu Glu Ser Asp Lys lie Val lie Gly Met lie 
210 215 220 

Ala Pro Ala Val Arg Ala Ala lie Gin Glu Glu Phe Gly lie Asp Glu 
225 230 235 240 
Asp Val Ala Met Ala Glu Lys Leu Val Ser Phe Leu Lys Thr Ile Gly 
245 250 255 

Phe Asp Lys Val Phe Asp Val Ser Phe Gly Ala Asp Leu Val Ala Tyr 
260 265 270 

Glu Glu Ala His Glu Phe Tyr Glu Arg Leu Lys Lys Gly Glu Arg Leu 
275 280 285 

Pro Gin Phe Thr Ser Cys Cys Pro Ala Trp Val Lys His Ala Glu His 
290 295 300 

Thr Tyr Pro Gin Tyr Leu Gin Asn Leu Ser Ser val Lys ser Pro Gin 
305 310 315 320 

Gin Ala Leu Gly Thr Val lie Lys Lys lie Tyr Ala Arg Lys Leu Gly 
325 330 335 

val Pro Glu Glu Lys lie Phe Leu Val Ser Phe Met Pro cys Thr Ala 
340 345 350 

Lys Lys Phe Glu Ala Glu Arg Glu Glu His Glu Gly lie val Asp lie 
355 360 365 

Val Leu Thr Thr Arg Glu Leu Ala Gln Leu Ile Lys Met Ser Arg Ile 
370 375 380 

Asp lie Asn Arg val Glu Pro Gin Pro Phe Asp Arg Pro Tyr Gly val 
385 390 395 400 

ser ser Gin Ala Gly Leu Gly Phe Gly Lys Ala Gly Gly val Phe Ser 
405 410 415 

Cys Val Leu ser Val Leu Asn Glu Glu lie Gly lie Glu Lys Val Asp 
420 425 430 

val Lys Ser Pro Glu Asp Gly lie Arg val Ala Glu val Thr Leu Lys 
435 440 445 

Asp Gly Thr Ser Phe Lys Gly Ala val lie Tyr Gly Leu Gly Lys Val 
450 455 460 

Lys Lys Phe Leu Glu Glu Arg Lys Asp Val Glu lie lie Glu val Met 
465 470 475 480 

Ala Cys Asn Tyr Gly Cys Val Gly Gly Gly Gly Gin Pro Tyr Pro Asn 
485 490 495 

Asp ser Arg lie Arg Glu His Arg Ala Lys Val Leu Arg Asp Thr Met 
500 505 510 

Gly Ile Lys Ser Leu Leu Thr Pro Val Glu Asn Leu Phe Leu Met Lys 
515 520 525 

Leu Tyr Glu Glu Asp Leu Lys Asp Glu His Thr Arg His Glu Ile Leu 
530 535 540 

His Thr Thr Tyr Arg Pro Arg Arg Arg Tyr Pro Glu Lys Asp Val Glu 
545 550 555 560 



lie Leu Pro val Pro Asn Gly Glu Lys Arg Thr val Lys val cys Leu 
565 570 575 



Gly Thr ser cys Tyr Thr Lys Gly ser Tyr Glu lie Leu Lys Lys Leu 
580 585 590 



Val Asp Tyr Val Lys Glu Asn Asp Met Glu Gly Lys lie Glu Val Leu 
595 600 605 



Gly Thr Phe Cys val Glu Asn cys Gly Ala ser Pro Asn Val lie val 
610 615 620 



Asp Asp Lys lie lie Gly Gly Ala Thr Phe Glu Lys Val Leu Glu Glu 
625 630 635 640 



Leu Ser Lys Asn Gly 
645 



<210> 18 

<211> 1206 

<212> PRT 

<213> Nyctotherus ovalis 

<400> 18 

Met lie ser Arg Leu lie Ala Lys Lys Ala Pro Leu Phe Leu Arg Thr 
1 5 10 15 



Phe Ala Thr Ser Glu Met lie Ser Leu Lys lie Asp Gly Lys lie lie 
20 25 30 



Ser val Pro Lys Gly lie Met Leu Ala Asp Ala lie Lys Lys Ala Gly 
35 40 45 



Ala Asn Val Pro Thr Met Cys Tyr His Pro Asp Leu Pro Thr Ser Gly 
50 55 60 



Gly lie Cys Arg Val Cys Leu Val Glu Ser Ala Lys Ser Pro Gly Tyr 
65 70 75 80 



Pro Ile Ile Ser Cys Arg Thr Pro Val Glu Glu Gly Met Glu Ile Val 
85 90 95 



Thr Gin Gly Ser Lys Met Lys Glu Tyr Arg Gin Ala Asn Leu Ala Leu 
100 105 110 



Met Leu Ser Arg His Pro Asn Ala Cys Leu Ser Cys Thr Ser Asn Thr 
115 120 125 



Asn cys Lys Thr Gin Glu Leu Ser Ala Asn Met Asn lie Gly Gin Cys 
130 135 140 



Gly Phe Ala Asn Ala Thr Pro Pro Lys Asn Asp Asp Ser Tyr Asp Met 
145 150 155 160 

Thr Thr Ala Ile Glu Arg Asp Asn Asp Lys Cys Ile Asn Cys Asp Ile 
165 170 175 

cys val His Thr cys Ser Leu Gin Gly Leu Asn Ala Leu Gly Phe Tyr 
180 185 190 

Asn Glu Glu Gly His Ala Val Lys Ser Met Gly Thr Leu Asp Val Ser 
195 200 205 

Glu Cys lie Gin cys Gly Gin Cys lie Asn Arg cys Pro Thr Gly Ala 
210 215 220 

lie Thr Glu Lys Ser Glu lie Arg Pro Val Leu Asp Ala lie Asn lie 
225 230 235 240 

Gin Gin Arg Leu Val Phe Gin Met Ala Pro Ser lie Arg Val Ala Val 
245 250 255 

Ala Glu Glu Phe Gly lie Lys Pro Gly Glu Lys lie Leu Lys Asn Glu 
260 265 270 

lie Ala Thr Ala Leu Arg Lys Leu Gly ser Asn Val Phe Val Leu Asp 
275 280 285 

Thr Asn Phe Ser Ala Asp Leu Thr lie lie Glu Glu Gly His Glu Leu 
290 295 300 

lie Glu Arg Leu Tyr Arg Asn Val Thr Gly Lys Lys Leu Leu Gly Gly 
305 310 315 320 

Asp His Met Pro lie Asp Leu Pro Met Leu Thr Ser cys Cys Pro Gly 
325 330 335 

Trp lie Met Phe lie Glu Lys Asn Tyr Pro Asp Leu Leu Asn Asn Leu 
340 345 350 

Ser Thr Cys Lys Ser Pro Gln Gly Met Leu Gly Ala Leu Ile Lys Gly 
355 360 365 

Tyr Trp Ala Lys Asn lie Lys Lys Met Asp Pro Lys Asp lie Val Ser 
370 375 380 

Val Ser lie Met Pro cys Thr Ala Lys Lys Ala Glu Lys Glu Arg Pro 
385 390 395 400 

Gin Leu Arg Gly Asp Glu Gly Tyr Lys Asp Val Asp Tyr lie Leu Thr 
405 410 415 

Thr Arg Glu Leu Ala Lys Met Leu Lys Gin ser Asn lie Asp Leu Ala 
420 425 430 

Lys Met Glu Pro Thr Pro Phe Asp Lys Val Met ser Glu Gly Thr Gly 
435 440 445 
450 455 460 

Arg Thr Ala Asn Glu Val lie Thr Gly Arg Glu Val Pro Phe Lys Asn 
465 470 475 480 

Leu Asn lie Glu Ala Val Arg Gly Met Glu Gly lie Arg Glu Ala Gly 
485 490 495 

lie Lys Leu Glu Asn Val Leu Asp Lys Tyr Lys Ala Phe Glu Gly Val 
500 505 510 

Thr val Lys val Ala lie Ala His Gly Pro Asn Asn Ala Arg Lys Val 
515 520 525 

Met Asp lie lie Lys Gin Ala Lys Glu Ser Gly Lys Pro Ala Pro Trp 
530 535 540 

His Phe Val Glu Val Met Ala Cys Pro Gly Gly cys lie Gly Gly Gly 
545 550 555 560 

Gly Gin Pro Lys Pro Thr Asn Leu Glu lie Arg Gin Ala Arg Thr Gin 
565 570 575 

Leu Thr Phe Lys Glu Asp Met Asp Leu Pro Leu Arg Lys Ser His Asp 
580 585 590 

Asn Pro Glu lie Lys Ala lie Tyr Glu Asn Tyr Leu Lys Glu Pro Leu 
595 600 605 

Gly His Asn Ser His His Tyr Leu His Thr Thr Tyr ser Ser Gin Lys 
610 615 620 

val Arg Asp Met Asn Leu Tyr Asn Ala Asn Glu Ala Ala Gly Leu Asp 
625 630 635 640 

Glu lie Leu Ala Lys Tyr Pro Lys Glu Lys Glu Tyr Leu Met Pro lie 
645 650 655 

lie lie Glu Glu His Asp Lys Lys Gly Tyr lie Ser Asp Pro Ser lie 
660 665 670 

Val Lys lie Ser Glu His Leu Gly Met Tyr Pro Ala Gin lie Glu Ser 
675 680 685 

lie Leu ser ser Tyr His Tyr Phe Pro Arg Glu His Thr lie Ala lie 
690 695 700 

Leu Met ser lie Cys val His Cys His Asn Cys Met Met Lys Gly Gin 
705 710 715 720 

Gly Arg Leu Leu Lys Thr lie Gin Glu Thr Tyr Asp lie His Glu Thr 
725 730 735 

His Gly Gly val Ala Lys Asp Gly Ser Phe Thr Leu His Thr Leu Asn 
740 745 750 
Trp Leu Gly Tyr Cys Val Asn Asp Ala Pro Ala Met Met Ile Lys Arg 
755 760 765 

Lys Gly Thr Asn Tyr Val Glu Thr Phe Thr Gly Leu Leu Gly Asp Asn 
770 775 780 

lie Asp Gin Arg Leu Lys Ser Leu Lys Asn Leu Lys Lys Glu Leu Pro 
785 790 795 800 

Lys Trp Pro Lys Asn Asn lie Arg Glu Met Lys Ser Gin Arg Asn Gly 
805 810 815 

Asn ser Tyr ser cys Met Asn Thr Gin Ala Pro lie Ala Glu Ala Thr 
820 825 830 

Lys Lys Ala val ser Met Gly Pro Glu Lys Val lie Glu Glu val Phe 
835 840 845 

Lys Ser Asn Leu Val Gly Arg Gly Gly Ala Gly Phe Arg Thr Gly Lys 
850 855 860 

Lys Trp Glu Ser Ala Tyr Lys Thr Pro Ala Ser Asp Lys Tyr val Val 
865 870 875 880 

cys Asn Ala Asp Glu Gly Leu Pro ser Thr Tyr Lys Asp Trp Cys Leu 
885 890 895 

Leu Asn Asn Glu Ala Lys Arg Lys Glu Val Phe Thr Gly Met Gly lie 
900 905 910 

Cys Ala Lys Thr lie Gly Ala Lys Arg Cys Phe Met Tyr Leu Arg Tyr 
915 920 925 

Glu Tyr Arg Asn Leu val Pro Ala Leu Glu Gin Ser lie Lys Asp Val 
930 935 940 

Gin Ser Thr Cys Pro Glu Leu Ala Asp Leu Lys Tyr Glu lie Arg Leu 
945 950 955 960 

Gly Gly Gly Pro Tyr Val Ala Gly Glu Glu Asn Ala Gin Phe Glu ser 
965 970 975 

lie Glu Gly Arg Ala Pro Leu Pro Arg Lys Asp Arg Pro Gly Asn lie 
980 985 990 

Phe Pro Thr Met Glu Gly Leu Phe His Lys Pro Thr Val lie Asn Asn 
995 1000 1005 

Val Glu Thr Phe Phe Ala lie Pro His lie lie Gin Gin Gly ser 
1010 1015 1020 

Gin Ser Phe Gly Glu Gly Lys Met Pro Lys Leu Leu Ser Val Thr 
1025 1030 1035 

Gly Asp Val Asp Glu Pro lie Leu lie Glu Thr Asn Leu Asn Asn 
1040 1045 1050 
Tyr Ser Leu Asn His Leu Leu Gln Glu Ile Ser Ala Lys Asp Ile 
1055 1060 1065 

val Ala Ala Glu lie Gly Gly Cys Thr Glu Pro lie lie Phe Gly 
1070 1075 1080 

Ser Lys Phe Asp Thr Leu Phe Gly Phe Gly Arg Gly Thr Leu Asn 
1085 1090 1095 

Ala Val Gly Ser Val Val Leu Phe Asn ser ser cys Asp Leu Gly 
1100 1105 1110 

Lys lie Tyr Glu Asn Lys Leu Lys Phe Met Ala Glu Glu ser Cys 
1115 1120 1125 

Lys Gin cys val Pro cys Arg Asp Gly Ser Tyr lie Phe His Arg 
1130 1135 1140 

Ala Phe Lys Glu Leu Arg Asp Thr Gly Lys Ser ser Tyr Asn Met 
1145 1150 1155 

Arg Ala Leu Ala Val Ala Ser Glu Ser Ala Ala Arg ser Ser lie 
1160 1165 1170 

Cys Ala His Gly Lys Ala Leu Glu Ser Leu Phe Lys Ser Ala Cys 
1175 1180 1185 

Asp Phe Met Asn Lys Thr Lys Pro lie Tyr Gin Pro His Ser Thr 
1190 1195 1200 

Tyr His Gin 
1205 

<210> 19 
<211> 467 
<212> PRT 

<213> Spironucleus barkhanus 
<400> 19 

Met Lys val Arg Gin ser Pro Phe Lys lie Asp lie Thr Asn Gly Pro 
1 5 10 15 

lie Asp Arg Asn Asp Ala lie Gin lie Asp Tyr Gin Lys cys lie Gly 
20 25 30 

Cys Gin Met Cys Ala Lys Thr Cys Thr Asp ser Gin Asn Phe Asn lie 
35 40 45 

Phe Lys lie Ser Ala Pro Lys Thr Lys Pro Phe Val Asn Ala Tyr Gly 
50 55 60 

ser Val Ala Glu Gly Thr Glu Arg Asn Ala Leu Ala Gly Thr Asp cys 
65 70 75 80 

Thr Gly cys Gly Ala cys Val Arg Ala Cys Pro Val Glu Ala Leu Met 
85 90 95 
Pro Ala Phe Asn Ile Arg Pro Val Leu Glu Pro Ile Ser Glu Lys Lys 
100 105 110 

Lys Val Thr lie Ala Val lie Ala Pro Ser Thr Arg Val Gly Leu Ala 
115 120 125 

Glu Gly Met Gly Met Gly Val Gly Val Thr Ala Glu Arg Gin Met Val 
130 135 140 

Tyr Glu Leu Lys Gin Met Gly Phe Asp Tyr Val Phe Asp Asn Met Trp 
145 150 155 160 

Gly Ala Asp Ala Pro Thr Thr Glu Asp Ala Lys Glu lie Leu Lys Ala 
165 170 175 

Lys Ala Ala Gly Lys Thr Ala Phe Thr ser Cys Cys Pro Ala Trp val 
180 185 190 

Lys Leu Val Glu Thr Thr Tyr Pro Glu Leu Leu Pro Asn lie ser ser 
195 200 205 

Ala Arg Ser Pro His Gly lie lie Cys Ser Val lie Lys Lys Tyr Phe 
210 215 220 

Ala Lys Asp lie Gly Lys Lys Ala Asp Glu Leu Tyr Val Val Gly Val 
225 230 235 240 

Met Pro cys Thr Ala Lys Lys Asn Glu Ala Ala Arg Lys Glu Leu Thr 
245 250 255 

Thr Asp Gly ser Pro Asp cys Asp lie ser lie Thr Thr Arg Glu Leu 
260 265 270 

Met Ala Tyr Leu Lys Glu Lys Lys Val Thr Phe Ser Ala Ala Arg Glu 
275 280 285 

lie Glu Leu Lys Asp Asn Val Gin Ala Gin Tyr Asp Ala Pro Phe Asn 
290 295 300 

Thr Phe Ser Gly Ser Ala Tyr lie Tyr Gly Lys Thr Ala Gly Val Thr 
305 310 315 320 

Glu Ala Val val Arg Tyr Val cys Ala lie Lys Lys Val Pro Phe Ser 
325 330 335 

Val Gly Met lie Thr Lys Glu Leu lie Trp Glu Asn Lys Leu His ser 
340 345 350 

Ser Ser Leu Thr Leu Leu Thr Phe Ser Ala Ala Gly Glu Asp Tyr Arg 
355 360 365 

lie Cys Val Ser Tyr Gly Gly Leu Ala Ala His Lys Ala Val Glu Leu 
370 375 380 

Tyr Lys Ser Gly Glu Leu Lys Val Asp Ala Val Glu Val Met Val Cys 
385 390 395 400 

Pro Gly Gly Cys Val Gly Gly Gly Gly Gln Pro Lys Gln Pro Lys Lys 
405 410 415 

Asp Met lie Leu Lys Arg His Glu Gly Leu Asp Lys His Asp Lys Glu 
420 425 430 

Ala Pro Tyr Ser Asn Cys Thr Glu Asn Pro Thr Leu Asn Glu Phe Tyr 
435 440 445 

Glu Arg lie Gly Thr Asp Val His His val Met His Thr Thr Tyr ser 
450 455 460 

Ala Tyr Lys 
465 

<210> 20 
<211> 468 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 20 

Met Leu Ala Ser ser Ala Thr Ala Met Lys Gly Phe Ala Asn ser Leu 
1 5 10 15 

Arg Met Lys Asp Tyr Ser Ser Thr Gly lie Asn Phe Asp Met Thr Lys 
20 25 30 

Cys lie Asn Cys Gin Ser Cys Val Arg Ala cys Thr Asn lie Ala Gly 
35 40 45 

Gin Asn Val Leu Lys Ser Leu Thr Val Asn Gly Lys Ser Val Val Gin 
50 55 60 

Thr Val Thr Gly Lys Pro Leu Ala Glu Thr Asn cys lie ser cys Gly 
65 70 75 80 

Gin cys Thr Leu Gly cys Pro Lys Phe Thr lie Phe Glu Ala Asp Ala 
85 90 95 

lie Asn Pro Val Lys Glu val Leu Thr Lys Lys Asn Gly Arg lie Ala 
100 105 110 

Val Cys Gin lie Ala Pro Ala lie Arg lie Asn Met Ala Glu Ala Leu 
115 120 ~ 125 

Gly val Pro Ala Gly Thr lie Ser Leu Gly Lys val val Thr Ala Leu 
130 135 140 

Lys Arg Leu Gly Phe Asp Tyr Val Phe Asp Thr Asn Phe Ala Ala Asp 
145 150 155 160 

Met Thr lie val Glu Glu Ala Thr Glu Leu val Gin Arg Leu Ser Asp 
165 170 175 

Lys Asn Ala Val Leu Pro Met Phe Thr ser Cys cys Pro Ala Trp val 
180 185 190 

Asn Tyr Val Glu Lys Ser Asp Pro Ser Leu Ile Pro Tyr Leu Ser Ser 
195 200 205 

Cys Arg Ser Pro Met Ser Met Leu Ser Ser Val Ile Lys Asn Val Phe 
210 215 220 

Pro Lys Lys Ile Gly Thr Thr Ala Asp Lys Ile Tyr Asn Val Ala Ile 
225 230 235 240 

Met Pro Cys Thr Arg Lys Lys Asp Glu Ile Gln Arg Ser Gln Phe Thr 
245 250 255 

Met Lys Asp Gly Lys Gln Glu Thr Gly Ala Val Leu Thr Ser Arg Glu 
260 265 270 

Leu Ala Lys Met Ile Lys Glu Ala Lys Ile Asn Phe Lys Glu Leu Pro 
275 280 285 

Asp Thr Pro Cys Asp Asn Phe Tyr Ser Glu Ala Ser Gly Gly Gly Ala 
290 295 300 

Ile Phe Cys Ala Thr Gly Gly Val Met Glu Ala Ala Val Arg Ser Ala 
305 310 315 320 

Tyr Lys Phe Leu Thr Lys Lys Glu Leu Ala Pro Ile Asp Leu Gln Asp 
325 330 335 

Val Arg Gly Val Ala Ser Gly Val Lys Leu Ala Glu Val Asp Ile Ala 
340 345 350 

Gly Thr Lys Val Lys Val Ala Val Ala His Gly Ile Lys Asn Ala Met 
355 360 365 

Thr Leu Ile Lys Lys Ile Lys Ser Gly Glu Glu Gln Phe Lys Asp Val 
370 375 380 

Lys Phe Val Glu Val Met Ala Cys Pro Gly Gly Cys Val Val Gly Gly 
385 390 395 400 

Gly Ser Pro Lys Ala Lys Thr Lys Lys Ala Val Gln Ala Arg Leu Asn 
405 410 415 

Ala Thr Tyr Ser Ile Asp Lys Ser Ser Lys His Arg Thr Ser Gln Asp 
420 425 430 

Asn Pro Gln Leu Leu Gln Leu Tyr Lys Glu Ser Phe Glu Gly Lys Phe 
435 440 445 

Gly Gly His Val Ala His His Leu Leu His Thr His Tyr Lys Asn Arg 
450 455 460 

Lys Val Asn Pro 
465 

<210> 21 
<211> 449 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 21 

Met Leu Ala Ser Ser Ser Arg Ala Ala Ala Asn Ile Arg Trp Val Asp 
1 5 10 15 

Thr Ser His Asn Ala Ile Ala Phe Asp Met His Lys Cys Ile Asn Cys 
20 25 30 

Gln Ala Cys Val Arg Ala Cys Lys Asn Val Ala Gly Gln Ser Val Leu 
35 40 45 

Lys Ser Val Lys Ile Asn Glu Gly Lys Lys Lys Gly Val Val Gln Thr 
50 55 60 

Val Thr Gly Lys Leu Leu Ala Glu Thr Asn Cys Ile Gly Cys Gly Gln 
65 70 75 80 

Cys Thr Leu Val Cys Pro Thr Gln Ala Ile His Glu Lys Asp Ala Leu 
85 90 95 

Lys Gln Met Asn Asn Ile Phe Lys Asn Lys Gly Asp Arg Ile Leu Val 
100 105 110 

Cys Gln Ile Ala Pro Ala Ile Arg Ile Asn Met Arg Arg Pro Trp Cys 
115 120 125 

Ser Ser Arg Asn Ser Phe His Arg Gln Ser Arg Tyr Ser Pro Gln Arg 
130 135 140 

Leu Gly Phe Asp Tyr Val Phe Asp Thr Asn Phe Gly Ala Asp Leu Thr 
145 150 155 160 

Ile Val Glu Glu Ala Thr Glu Leu Leu Gln Arg Leu Asn Asp Pro Lys 
165 170 175 

Ala Val Leu Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val Asn Tyr 
180 185 190 

Val Glu Lys Ser Tyr Pro Gln Trp Met Pro His Leu Ser Thr Cys Arg 
195 200 205 

Ser Pro Ile Gly Met Leu Ser Ala Val Ile Lys Asn Val Phe Pro Lys 
210 215 220 

His Ile Gly Val Asp Pro Lys Arg Ile Phe Ser Val Gly Ile Met Pro 
225 230 235 240 

Cys Thr Ala Lys Lys Asp Glu Ala Ala Arg Glu Gln Leu Met Thr Lys 
245 250 255 

Ser Gly Leu His Glu Thr Asp Leu Asp Ile Thr Ser Arg Glu Leu Ala 
260 265 270 
275 280 285 

Glu Leu Asp Ser Pro Tyr Ala Met Ala Thr Gly Gly Gly Ala Ile Phe 
290 295 300 

Cys Ala Thr Gly Gly Val Met Glu Ala Ala Val Arg Ser Ala Tyr Lys 
305 310 315 320 

Phe Ala Thr Gly Lys Glu Leu Ala Pro Ile Glu Phe Val Gln Val Arg 
325 330 335 

Gly Ala Glu Lys Gly Ile Lys Val Gly Thr Val Asp Ile Asn Gly Arg 
340 345 350 

Glu Ile Lys Val Ala Val Ala Gln Gly Val Lys Asn Ala Met Ser Leu 
355 360 365 

Ile Lys Lys Ile Glu Glu Gly Gln Asp Asp Val Lys Gly Val Val Phe 
370 375 380 

Cys Glu Val Met Ala Cys Pro Gly Gly Cys Val Gly Gly Gly Gly Ser 
385 390 395 400 

Pro Arg Ala Lys Thr Lys Ala Ala Met Asn Lys Arg Leu Asp Ala Thr 
405 410 415 

Tyr Arg Ile Asp Arg Ala Ser Lys Tyr Arg Thr Pro Gln Asp Asn Thr 
420 425 430 

Gln Leu Gln Asp Leu Tyr Asn Ala Thr Trp Val Val Ser Leu Val Met 
435 440 445 

Asp 



<210> 22 
<211> 589 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 22 

Ala Ser Thr Gly Ile Asn Ser Thr Ala Asn Ile Leu Arg Asn Ile Thr 
1 5 10 15 

Val Thr Val Asn Gly Lys Pro Leu Glu Ala Lys Lys Gly Glu Thr Val 
20 25 30 

Leu Glu Leu Cys Asp Arg Asn Asn Ile Arg Ile Pro Arg Leu Cys Phe 
35 40 45 

His Pro Asn Leu Pro Pro Lys Ala Ser Cys Arg Val Cys Leu Val Glu 
50 55 60 

Cys Asp Gly Lys Trp Leu Ser Pro Ala Cys Val Thr Thr Val Trp Asp 
65 70 75 80 
85 90 95 

Asn Asn Leu Lys Glu Leu Leu Asp Cys His Asp Glu Thr Cys Ser Ala 
100 105 110 

Cys Ile Ala Asn His Arg Cys Gln Phe Arg Asp Met Asn Val Ala Tyr 
115 120 125 

Ser Val Lys Ala Glu Thr Lys Glu Ile Cys Ser Glu Glu Gly Ile Asp 
130 135 140 

Glu Ser Thr Asn Ala Ile Arg Leu Asp Thr Ser Lys Cys Val Leu Cys 
145 150 155 160 

Gly Arg Cys Ile Arg Ala Cys Glu Glu Val Ala Gly Thr Ser Ala Ile 
165 170 175 

Ile Phe Gly Asn Arg Ala Lys Lys Met Arg Ile Gln Pro Thr Phe Gly 
180 185 190 

Val Thr Leu Gln Glu Thr Ser Cys Ile Lys Cys Gly Gln Cys Thr Leu 
195 200 205 

Tyr Cys Pro Val Gly Ala Ile Thr Glu Lys Ser Gln Val Lys Glu Ala 
210 215 220 

Leu Asp Ile Leu Ala Asn Lys Gly Lys Lys Ile Thr Val Val Gln Val 
225 230 235 240 

Ala Pro Ala Val Arg Val Ala Leu Ser Glu Ala Phe Gly Tyr Lys Glu 
245 250 255 

Gly Thr Val Thr Thr Gly Lys Met Val Ser Ala Leu Lys Ala Leu Gly 
260 265 270 

Phe Asp Leu Val Tyr Asp Thr Asn Tyr Gly Ala Asp Leu Thr Ile Cys 
275 280 285 

Glu Glu Ala Gly Glu Leu Val Asn Arg Leu Arg Asp Pro Asn Ala Lys 
290 295 300 

Phe Pro Met Phe Thr Thr Cys Cys Pro Ala Trp Val Asn Tyr Val Glu 
305 310 315 320 

Gln Ser Ala Pro Asp Phe Ile Pro Asn Leu Ser Ser Cys Arg Ser Pro 
325 330 335 

Gln Gly Met Leu Ser Ala Leu Ile Lys Asn Tyr Leu Pro Lys Leu Leu 
340 345 350 

Asp Val Lys Gln Glu Asp Val Leu Asn Phe Ser Ile Met Pro Cys Thr 
355 360 365 

Ala Lys Lys Asp Glu Val Glu Arg Pro Glu Leu Arg Thr Lys Ser Gly 
370 375 380 
Leu Lys Glu Thr Asp Met Val Leu Thr Val Arg Glu Leu Val Glu Met 
385 390 395 400 

Ile Lys Leu Ser Asn Ile Asp Phe Asn Asn Leu Pro Asp Thr Gln Phe 
405 410 415 

Asp Asn Ile Phe Gly Phe Gly Ser Gly Ala Gly Gln Ile Phe Ala Ala 
420 425 430 

Thr Gly Gly Val Met Glu Ala Ala Ser Arg Thr Ala Phe Glu Val Tyr 
435 440 445 

Thr Gly Lys Lys Leu Thr Asn Val Asn Ile Tyr Pro Val Arg Gly Met 
450 455 460 

Asp Gly Leu Arg Ile Ala Glu Leu Asp Leu Asp Gly Thr Lys Leu Lys 
465 470 475 480 

Val Ala Val Cys His Gly Ile Ala Asn Thr Ala Lys Leu Leu Asp Arg 
485 490 495 

Leu Arg Glu Lys Asp Pro Glu Leu Met Asp Ile Lys Phe Ile Glu Ile 
500 505 510 

Met Ala Cys Pro Gly Gly Cys Val Cys Gly Gly Gly Thr Pro Gln Pro 
515 520 525 

Lys Asn Arg Val Ser Leu Asp Asn Arg Leu Ala Ala Ile Tyr Asn Ile 
530 535 540 

Asp Ala Lys Met Glu Cys Arg Lys Ser His Glu Asn Pro Leu Ile Lys 
545 550 555 560 

Gly Val Tyr Lys Glu Phe Leu Gly Lys Pro Asn Ser His Leu Ala His 
565 570 575 

Glu Leu Leu His Thr His Phe Lys His His Pro Lys Trp 
580 585 

<210> 23 
<211> 582 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 23 

Met Lys Thr Ile Ile Leu Asn Gly Asn Glu Val His Thr Asp Lys Asp 
1 5 10 15 

Ile Thr Ile Leu Glu Leu Ala Arg Glu Asn Asn Val Asp Ile Pro Thr 
20 25 30 

Leu Cys Phe Leu Lys Asp Cys Gly Asn Phe Gly Lys Cys Gly Val Cys 
35 40 45 

Met Val Glu Val Glu Gly Lys Gly Phe Arg Ala Ala Cys Val Ala Lys 
50 55 60 
Val Glu Asp Gly Met Val Ile Asn Thr Glu Ser Asp Glu Val Lys Glu 
65 70 75 80 

Arg Ile Lys Lys Arg Val Ser Met Leu Leu Asp Lys His Glu Phe Lys 
85 90 95 

Cys Gly Gln Cys Ser Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu Val 
100 105 110 

Ile Lys Thr Lys Ala Lys Ala Ser Lys Pro Phe Leu Pro Glu Asp Lys 
115 120 125 

Asp Ala Leu Val Asp Asn Arg Ser Lys Ala Ile Val Ile Asp Arg Ser 
130 135 140 

Lys Cys Val Leu Cys Gly Arg Cys Val Ala Ala Cys Lys Gln His Thr 
145 150 155 160 

Ser Thr Cys Ser Ile Gln Phe Ile Lys Lys Asp Gly Gln Arg Ala Val 
165 170 175 

Gly Thr Val Asp Asp Val Cys Leu Asp Asp Ser Thr Cys Leu Leu Cys 
180 185 190 

Gly Gln Cys Val Ile Ala Cys Pro Val Ala Ala Leu Lys Glu Lys Ser 
195 200 205 

His Ile Glu Lys Val Gln Glu Ala Leu Asn Asp Pro Lys Lys His Val 
210 215 220 

Ile Val Ala Met Ala Pro Ser Val Arg Thr Ala Met Gly Glu Leu Phe 
225 230 235 240 

Lys Met Gly Tyr Gly Lys Asp Val Thr Gly Lys Leu Tyr Thr Ala Leu 
245 250 255 

Arg Met Leu Gly Phe Asp Lys Val Phe Asp Ile Asn Phe Gly Ala Asp 
260 265 270 

Met Thr Ile Met Glu Glu Ala Thr Glu Leu Leu Gly Arg Val Lys Asn 
275 280 285 

Asn Gly Pro Phe Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val Arg 
290 295 300 

Leu Ala Gln Asn Tyr His Pro Glu Leu Leu Asp Asn Leu Ser Ser Ala 
305 310 315 320 

Lys Ser Pro Gln Gln Ile Phe Gly Thr Ala Ser Lys Thr Tyr Tyr Pro 
325 330 335 

Ser Ile Ser Gly Ile Ala Pro Glu Asp Val Tyr Thr Val Thr Ile Met 
340 345 350 

Pro Cys Asn Asp Lys Lys Tyr Glu Ala Asp Ile Pro Phe Met Glu Thr 
355 360 365 
Asn Ser Leu Arg Asp Ile Asp Ala Ser Leu Thr Thr Arg Glu Leu Ala 
370 375 380 

Lys Met Ile Lys Asp Ala Lys Ile Lys Phe Ala Asp Leu Glu Asp Gly 
385 390 395 400 

Glu Val Asp Pro Ala Met Gly Thr Tyr Ser Gly Ala Gly Ala Ile Phe 
405 410 415 

Gly Ala Thr Gly Gly Val Met Glu Ala Ala Ile Arg Ser Ala Lys Asp 
420 425 430 

Phe Ala Glu Asn Lys Glu Leu Glu Asn Val Asp Tyr Thr Glu Val Arg 
435 440 445 

Gly Phe Lys Gly Ile Lys Glu Ala Glu Val Glu Ile Ala Gly Asn Lys 
450 455 460 

Leu Asn Val Ala Val Ile Asn Gly Ala Ser Asn Phe Phe Glu Phe Met 
465 470 475 480 

Lys Ser Gly Lys Met Asn Glu Lys Gln Tyr His Phe Ile Glu Val Met 
485 490 495 

Ala Cys Pro Gly Gly Cys Ile Asn Gly Gly Gly Gln Pro His Val Asn 
500 505 510 

Ala Leu Asp Arg Glu Asn Val Asp Tyr Arg Lys Leu Arg Ala Ser Val 
515 520 525 

Leu Tyr Asn Gln Asp Lys Asn Val Leu Ser Lys Arg Lys Ser His Asp 
530 535 540 

Asn Pro Ala Ile Ile Lys Met Tyr Asp Ser Tyr Phe Gly Lys Pro Gly 
545 550 555 560 

Glu Gly Leu Ala His Lys Leu Leu His Val Lys Tyr Thr Lys Asp Lys 
565 570 575 

Asn Val Ser Lys His Glu 
580 

<210> 24 
<211> 497 
<212> PRT 

<213> Chlamydomonas reinhardtii 
<400> 24 

Met Ser Ala Leu Val Leu Lys Pro Cys Ala Ala Val Ser Ile Arg Gly 
1 5 10 15 

Ser Ser Cys Arg Ala Arg Gln Val Ala Pro Arg Ala Pro Leu Ala Ala 
20 25 30 

Ser Thr Val Arg Val Ala Leu Ala Thr Leu Glu Ala Pro Ala Arg Arg 
35 40 45 
Leu Gly Asn Val Ala Cys Ala Ala Ala Ala Pro Ala Ala Glu Ala Pro 
50 55 60 

Leu Ser His Val Gln Gln Ala Leu Ala Glu Leu Ala Lys Pro Lys Asp 
65 70 75 80 

Asp Pro Thr Arg Lys His Val Cys Val Gln Val Ala Pro Ala Val Arg 
85 90 95 

Val Ala Ile Ala Glu Thr Leu Gly Leu Ala Pro Gly Ala Thr Thr Pro 
100 105 110 

Lys Gln Leu Ala Glu Gly Leu Arg Arg Leu Gly Phe Asp Glu Val Phe 
115 120 125 

Asp Thr Leu Phe Gly Ala Asp Leu Thr Ile Met Glu Glu Gly Ser Glu 
130 135 140 

Leu Leu His Arg Leu Thr Glu His Leu Glu Ala His Pro His Ser Asp 
145 150 155 160 

Glu Pro Leu Pro Met Phe Thr Ser Cys Cys Pro Gly Trp Ile Ala Met 
165 170 175 

Leu Glu Lys Ser Tyr Pro Asp Leu Ile Pro Tyr Val Ser Ser Cys Lys 
180 185 190 

Ser Pro Gln Met Met Leu Ala Ala Met Val Lys Ser Tyr Leu Ala Glu 
195 200 205 

Lys Lys Gly Ile Ala Pro Lys Asp Met Val Met Val Ser Ile Met Pro 
210 215 220 

Cys Thr Arg Lys Gln Ser Glu Ala Asp Arg Asp Trp Phe Cys Val Asp 
225 230 235 240 

Ala Asp Pro Thr Leu Arg Gln Leu Asp His Val Ile Thr Thr Val Glu 
245 250 255 

Leu Gly Asn Ile Phe Lys Glu Arg Gly Ile Asn Leu Ala Glu Leu Pro 
260 265 270 

Glu Gly Glu Trp Asp Asn Pro Met Gly Val Gly Ser Gly Ala Gly Val 
275 280 285 

Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Ala 
290 295 300 

Tyr Glu Leu Phe Thr Gly Thr Pro Leu Pro Arg Leu Ser Leu Ser Glu 
305 310 315 320 

Val Arg Gly Met Asp Gly Ile Lys Glu Thr Asn Ile Thr Met Val Pro 
325 330 335 

Ala Pro Gly Ser Lys Phe Glu Glu Leu Leu Lys His Arg Ala Ala Ala 
340 345 350 

Arg Ala Glu Ala Ala Ala His Gly Thr Pro Gly Pro Leu Ala Trp Asp 
355 360 365 

Gly Gly Ala Gly Phe Thr Ser Glu Asp Gly Arg Gly Gly Ile Thr Leu 
370 375 380 

Arg Val Ala Val Ala Asn Gly Leu Gly Asn Ala Lys Lys Leu Ile Thr 
385 390 395 400 

Lys Met Gln Ala Gly Glu Ala Lys Tyr Asp Phe Val Glu Ile Met Ala 
405 410 415 

Cys Pro Ala Gly Cys Val Gly Gly Gly Gly Gln Pro Arg Ser Thr Asp 
420 425 430 

Lys Ala Ile Thr Gln Lys Arg Gln Ala Ala Leu Tyr Asn Leu Asp Glu 
435 440 445 

Lys Ser Thr Leu Arg Arg Ser His Glu Asn Pro Ser Ile Arg Glu Leu 
450 455 460 

Tyr Asp Thr Tyr Leu Gly Glu Pro Leu Gly His Lys Ala His Glu Leu 
465 470 475 480 

Leu His Thr His Tyr Val Ala Gly Gly Val Glu Glu Lys Asp Glu Lys 
485 490 495 

Lys 



<210> 25 

<211> 415 

<212> PRT 

<213> Chlorella fusca 

<400> 25 

Ala Gly Pro Thr Ser Glu Cys Asp Cys Pro Pro Thr Pro Gln Ala Lys 
1 5 10 15 

Leu Pro His Trp Gln Gln Ala Leu Asp Glu Leu Ala Lys Pro Lys Glu 
20 25 30 

Ser Arg Arg Leu Met Ile Ala Gln Ile Ala Ser Ala Val Arg Val Ala 
35 40 45 

Ile Ala Glu Thr Ile Gly Leu Ala Pro Gly Asp Val Thr Ile Gly Gln 
50 55 60 

Leu Val Thr Gly Leu Arg Met Leu Gly Phe Asp Tyr Val Phe Asp Thr 
65 70 75 80 

Leu Phe Gly Ala Asp Leu Thr Ile Met Glu Glu Gly Thr Glu Leu Leu 
85 90 95 

His Arg Leu Gln Asp His Leu Glu Gln His Pro Asn Lys Glu Glu Pro 
100 105 110 

Leu Pro Met Phe Thr Ser Cys Cys Pro Gly Trp Val Ala Met Val Glu 
115 120 125 

Lys Ser Asn Pro Glu Leu Ile Pro Tyr Leu Ser Ser Cys Lys Ser Pro 
130 135 140 

Gln Met Met Leu Gly Ala Val Ile Lys Asn Tyr Tyr Ala Gln Gln Val 
145 150 155 160 

Gly Val Gln Pro Ser Asp Ile Cys Asn Val Ser Val Met Pro Cys Val 
165 170 175 

Arg Lys Gln Gly Glu Ala Asp Arg Glu Trp Phe Asn Thr Thr Gly Ala 
180 185 190 

Gly Leu Ala Arg Asp Val Asp His Val Val Thr Thr Ala Glu Val Gly 
195 200 205 

Lys Ile Phe Leu Glu Arg Gly Ile Lys Leu Asn Glu Leu Pro Glu Ser 
210 215 220 

Asn Phe Asp Asn Pro Ile Gly Glu Gly Thr Gly Gly Ala Leu Leu Phe 
225 230 235 240 

Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Val Tyr Glu 
245 250 255 

Val Val Thr Gln Lys Pro Met Gly Arg Val Asp Phe Glu Glu Val Arg 
260 265 270 

Gly Leu Glu Gly Ile Lys Glu Ala Glu Ile Thr Leu Lys Pro Gly Asp 
275 280 285 

Asp Ser Pro Phe Lys Ala Phe Ala Gly Ala Asp Gly Gln Gly Ile Thr 
290 295 300 

Leu Lys Ile Ala Val Ala Asn Gly Leu Gly Asn Ala Lys Lys Leu Ile 
305 310 315 320 

Lys Ser Leu Ser Glu Gly Lys Ala Lys Tyr Asp Phe Ile Glu Val Met 
325 330 335 

Ala Cys Pro Gly Gly Cys Ile Gly Gly Gly Gly Gln Pro Arg Ser Thr 
340 345 350 

Asp Lys Gln Ile Leu Gln Lys Arg Gln Gln Ala Met Tyr Asn Leu Asp 
355 360 365 

Glu Arg Ser Thr Ile Arg Arg Ser His Asp Asn Pro Phe Ile Gln Ala 
370 375 380 

Leu Tyr Asp Lys Phe Leu Gly Ala Pro Asn Ser His Lys Ala His Asp 
385 390 395 400 
405 410 415 

<210> 26 
<211> 505 
<212> PRT 

<213> Chlamydomonas reinhardtii 
<400> 26 

Met Ala Leu Gly Leu Leu Ala Glu Leu Arg Ala Gly Gln Ala Val Ala 
1 5 10 15 

Cys Ala Arg Arg Thr Asn Ala Pro Ala His Pro Ala Ala Val Val Pro 
20 25 30 

Cys Leu Pro Ser Arg Ala Gly Lys Phe Phe Asn Leu Ser Gln Lys Val 
35 40 45 

Pro Ser Ser Gln Ser Ala Arg Gly Ser Thr Ile Arg Val Ala Ala Thr 
50 55 60 

Ala Thr Asp Ala Val Pro His Trp Lys Leu Ala Leu Glu Glu Leu Asp 
65 70 75 80 

Lys Pro Lys Asp Gly Gly Arg Lys Val Leu Ile Ala Gln Val Ala Pro 
85 90 95 

Ala Val Arg Val Ala Ile Ala Glu Ser Phe Gly Leu Ala Pro Gly Ala 
100 105 110 

Val Ser Pro Gly Lys Leu Ala Thr Gly Leu Arg Ala Leu Gly Phe Asp 
115 120 125 

Gln Val Phe Asp Thr Leu Phe Ala Ala Asp Leu Thr Ile Met Glu Glu 
130 135 140 

Gly Thr Glu Leu Leu His Arg Leu Lys Glu His Leu Glu Ala His Pro 
145 150 155 160 

His Ser Asp Glu Pro Leu Pro Met Phe Thr Ser Cys Cys Pro Gly Trp 
165 170 175 

Val Ala Met Met Glu Lys Ser Tyr Pro Glu Leu Ile Pro Phe Val Ser 
180 185 190 

Ser Cys Lys Ser Pro Gln Met Met Met Gly Ala Met Val Lys Thr Tyr 
195 200 205 

Leu Ser Glu Lys Gln Gly Ile Pro Ala Lys Asp Ile Val Met Val Ser 
210 215 220 

Val Met Pro Cys Val Arg Lys Gln Gly Glu Ala Asp Arg Glu Trp Phe 
225 230 235 240 

Cys Val Ser Glu Pro Gly Val Arg Asp Val Asp His Val Ile Thr Thr 
245 250 255 
P»F»tte Lys Glu Arg Gly Ile Asn Leu Pro Glu 
265 270 

Leu Pro Asp Ser Asp Trp Asp Gln Pro Leu Gly Leu Gly Ser Gly Ala 
275 280 285 

Gly Val Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu 
290 295 300 

Thr Ala Tyr Glu Ile Val Thr Lys Glu Pro Leu Pro Arg Leu Asn Leu 
305 310 315 320 

Ser Glu Val Arg Gly Leu Asp Gly Ile Lys Glu Ala Ser Val Thr Leu 
325 330 335 

Val Pro Ala Pro Gly Ser Lys Phe Ala Glu Leu Val Ala Glu Arg Leu 
340 345 350 

Ala His Lys Val Glu Glu Ala Ala Ala Ala Glu Ala Ala Ala Ala Val 
355 360 365 

Glu Gly Ala Val Lys Pro Pro Ile Ala Tyr Asp Gly Gly Gln Gly Phe 
370 375 380 

Ser Thr Asp Asp Gly Lys Gly Gly Leu Lys Leu Arg Val Ala Val Ala 
385 390 395 400 

Asn Gly Leu Gly Asn Ala Lys Lys Leu Ile Gly Lys Met Val Ser Gly 
405 410 415 

Glu Ala Lys Tyr Asp Phe Val Glu Ile Met Ala Cys Pro Ala Gly Cys 
420 425 430 

Val Gly Gly Gly Gly Gln Pro Arg Ser Thr Asp Lys Gln Ile Thr Gln 
435 440 445 

Lys Arg Gln Ala Ala Leu Tyr Asp Leu Asp Glu Arg Asn Thr Leu 
450 455 460 

Arg Ser His Glu Asn Glu Ala Val Asn Gln Leu Tyr Lys Glu Phe Leu 
465 470 475 480 

Gly Glu Pro Leu Ser His Arg Ala His Glu Leu Leu His Thr His Tyr 
485 490 495 

Val Pro Gly Gly Ala Glu Ala Asp Ala 
500 505 



<210> 27 
<211> 403 
<212> PRT 

<213> Scenedesmus obliquus 
<400> 27 

Pro His Trp Gln Gln Thr Leu Asp Glu Leu Ala Lys Pro Lys Glu Arg 
1 5 10 15 

LsSs^aa^TeUTt^tt-dlle Ala Pro Ala Val Arg Gly Ile Ala Glu 
20 25 30 

Thr Met Gly Leu Asn Pro Gly Asp Val Thr Val Gly Gln Met Val Thr 
35 40 45 

Gly Leu Arg Met Leu Gly Phe Asp Tyr Val Phe Asp Thr Leu Phe Gly 
50 55 60 

Ala Asp Leu Thr Ile Met Glu Glu Gly Thr Glu Leu Leu His Arg Leu 
65 70 75 80 

Gln Asp His Leu Glu Gln His Pro Asn Lys Glu Glu Pro Leu Pro Met 
85 90 95 

Phe Thr Ser Cys Cys Pro Gly Trp Val Ala Met Val Glu Lys Ser Asn 
100 105 110 

Pro Glu Leu Ile Pro Tyr Leu Ser Ser Cys Lys Ser Pro Gln Met Met 
115 120 125 

Leu Gly Ala Val Ile Lys Asn Tyr Phe Ala Ala Glu Ala Gly Ala Lys 
130 135 140 

Pro Glu Asp Ile Cys Asn Val Ser Val Met Pro Cys Val Arg Lys Ser 
145 150 155 160 

Gly Glu Ala Glu Pro Arg Ser Gly Ser Thr His His Arg Ala Gly Arg 
165 170 175 

Arg Asp Val Asp His Val Met Thr Thr Ala Glu Leu Gly Lys Ile Phe 
180 185 190 

Val Glu Arg Gly Ile Lys Leu Asn Glu Leu Gln Glu Ser Pro Phe Asp 
195 200 205 

Asn Pro Val Gly Glu Gly Ser Gly Gly Gly Leu Leu Phe Gly Thr Thr 
210 215 220 

Gly Gly Val Met Glu Ala Ala Leu Arg Thr Val Tyr Glu Val Val Thr 
225 230 235 240 

Ala Glu Ala Leu Gly Pro Gln Arg Ser Ser Leu Thr Thr Ser Thr Ala 
245 250 255 

Trp Thr Pro Ala Gln Arg Ala Ser Pro Arg Pro Ser Pro Gln Ala Pro 
260 265 270 

Thr Ala Pro Ser Arg Pro Leu Gln Ala Gln Thr Glu Ser Gly Ile Thr 
275 280 285 

Leu Asn Ile Ala Val Ala Asn Gly Leu Gly Asn Ala Lys Lys Leu Ile 
290 295 300 

Lys Gln Leu Ala Ala Gly Glu Ser Lys Tyr Asp Phe Thr Glu Val Met 
305 310 315 320 
Ala Cys Pro Gly Gly Cys Ile Gly Gly Gly Gly Gln Pro Gln Arg Asn 
325 330 335 

Lys Gln Ile Leu Gln Lys Arg Gln Ala Ala Met Tyr Asp Leu Asp Glu 
340 345 350 

Arg Ala Val Ile Arg Arg Thr Glu Asn Pro Leu Ile Gly Ala Leu Tyr 
355 360 365 

Glu Lys Phe Leu Gly Glu Pro Asn Gly His Lys Ala His Glu Leu Leu 
370 375 380 

His Thr His Tyr Val Ala Gly Gly Val Pro Asp Arg Arg Ser Glu Gly 
385 390 395 400 

Glu Ala Trp 



<210> 28 
<211> 581 
<212> PRT 

<213> Thermoanaerobacter tengcongensis strain MB4T 
<400> 28 

Met Asp Lys Val Arg Val Thr Ile Asp Gly Ile Thr Val Glu Val Pro 
1 5 10 15 

Ser Tyr Tyr Thr Val Leu Glu Ala Ala Lys Glu Ala Gly Ile Asp Ile 
20 25 30 

Pro Thr Leu Cys Tyr Leu Lys Glu Ile Asn Gln Ile Gly Ala Cys Arg 
35 40 45 

Ile Cys Leu Val Glu Ile Glu Gly Val Arg Asn Leu Gln Thr Ser Cys 
50 55 60 

Thr Tyr Pro Val Phe Asp Gly Met Lys Val Tyr Thr Asn Thr Pro Lys 
65 70 75 80 

Ile Arg Glu Ala Arg Arg Leu Asn Leu Glu Leu Ile Leu Ser Asn His 
85 90 95 

Asp Arg Asn Cys Leu Thr Cys Val Arg Ser Thr Asn Cys Glu Leu Gln 
100 105 110 

Ala Leu Ala Lys Arg Leu Gly Val Glu Glu Ile Arg Phe Glu Gly Glu 
115 120 125 

Asn Ile Lys Tyr Pro Ile Asp Asp Ala Ser Pro Ala Val Val Arg Asp 
130 135 140 

Pro Asn Lys Cys Val Leu Cys Arg Arg Cys Val Ala Val Cys Ser Glu 
145 150 155 160 

Val Gln Asn Val Phe Ala Ile Gly Met Val Asn Arg Gly Phe Lys Thr 
165 170 175 
Met Val Ala Pro Ser Phe Gly Arg Ser Leu Lys Asp Ser Pro Cys Ile 
180 185 190 

Ser Cys Gly Gln Cys Ile Met Val Cys Pro Val Gly Ala Ile Tyr Glu 
195 200 205 

Lys Asp His Thr Lys Arg Val Tyr Glu Ala Leu Ala Asp Asp Lys Lys 
210 215 220 

Tyr Val Val Ala Gln Thr Ala Pro Ala Val Arg Val Ala Leu Gly Glu 
225 230 235 240 

Glu Phe Gly Met Pro Val Gly Thr Ile Val Thr Gly Lys Met Ala Ala 
245 250 255 

Ala Leu Arg Arg Met Gly Phe Asp Ala Val Phe Asp Thr Asn Phe Ala 
260 265 270 

Ala Asp Leu Thr Ile Met Glu Glu Gly Ser Glu Leu Leu Glu Arg Ile 
275 280 285 

Lys His Gly Gly Lys Leu Pro Met Ile Thr Ser Cys Ser Pro Gly Trp 
290 295 300 

Ile Ala Phe Cys Glu Lys Tyr Tyr Pro Glu Phe Ile Asp Asn Leu Ser 
305 310 315 320 

Thr Cys Lys Ser Pro His Met Met Met Gly Ala Leu Val Lys Ser Tyr 
325 330 335 

Tyr Ala Glu Lys Lys Gly Leu Asp Pro Lys Asp Ile Phe Val Val Ser 
340 345 350 

Ile Met Pro Cys Thr Ala Lys Lys Leu Glu Ile Glu Arg Glu Glu Met 
355 360 365 

Ile Arg Asn Gly Met Lys Asp Val Asp Ala Val Leu Thr Thr Arg Glu 
370 375 380 

Leu Ala Arg Met Ile Lys Glu Met Gly Ile Asp Phe Val Asn Leu Lys 
385 390 395 400 

Asp Glu Glu Phe Asp Glu Pro Leu Gly Met Ser Thr Gly Ala Gly Ala 
405 410 415 

Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Val 
420 425 430 

Ala Glu Ile Val Glu Gly Arg Asp Ile Gly Lys Ile Asp Phe Glu Glu 
435 440 445 

Val Arg Gly Leu Glu Gly Val Arg Glu Ala Thr Ile Thr Ile Asp Gly 
450 455 460 

Met Asp Ile Lys Ile Ala Ile Ala Asn Gly Thr Gly Asn Ala Lys Lys 
465 470 475 480 
Leu Leu Asp Lys Val Lys Ala Gly Glu Val Glu Tyr His Phe Ile Glu 
485 490 495 

Val Met Gly Cys Pro Gly Gly Cys Ile Met Gly Gly Gly Gln Pro Ile 
500 505 510 

His Asn Pro Asn Glu Met Glu Glu Val Lys Lys Leu Arg Ala Lys Ala 
515 520 525 

Ile Tyr Glu Ile Asp Lys Asn Leu Pro Ile Arg Lys Ser His Glu Asn 
530 535 540 

Pro Ala Ile Lys Arg Leu Tyr Glu Glu Phe Leu Gly Tyr Pro Leu Ser 
545 550 555 560 

Glu Lys Ser His Glu Leu Leu His Thr His Tyr Ser Arg Lys Glu Leu 
565 570 575 

Tyr Pro Leu Val Lys 
580 

<210> 29 
<211> 636 
<212> PRT 

<213> Neocallimastix frontalis 
<400> 29 

Met Ser Met Leu Ser Ser Val Leu Asn Lys Ala Val Val Asn Pro Lys 
1 5 10 15 

Leu Thr Arg Ser Leu Ala Thr Ala Ala Ala Glu Lys Met Val Asn Ile 
20 25 30 

Ser Ile Asn Gly Arg Lys Phe Gln Val Lys Pro Lys Thr Thr Val Leu 
35 40 45 

Glu Ala Ala Lys Ala Asn Gly Tyr Tyr Ile Pro Thr Leu Cys Tyr His 
50 55 60 

Gln Glu Leu Pro Val Ala Gly Asn Cys Arg Leu Cys Leu Val Tyr Ala 
65 70 75 80 

Lys Gly Ser Trp Lys Pro Leu Thr Ala Cys Thr Thr Glu Val Trp Glu 
85 90 95 

Gly Met Glu Ile Glu Thr Asp Ser Pro Ala Val Ile Glu Thr Val Arg 
100 105 110 

Ser Ser Leu Ser Met Met Arg Glu Glu His Pro Asn Asp Cys Met Thr 
115 120 125 

Cys Gly Ser Asn Gly Asp Cys Glu Phe Gln Asp Leu Ile Tyr Arg Tyr 
130 135 140 

Gln Ile Asp Ala Lys His Pro Val Arg Ser Leu Leu Lys His Lys Ser 
145 150 155 160 
Lys Lys Thr Asn His Ser Ile Thr Glu Pro Cys Tyr Ser Pro Phe Asp 
165 170 175 

Asn Thr Thr Phe Ser Val Ala Arg Asp Met Asn Lys Cys Val Lys Cys 
180 185 190 

Gly Arg Cys Ile Arg Ala Cys His His Phe Gln Asn Ile Asn Ile Leu 
195 200 205 

Gly Phe Ile Asn Arg Ala Gly Tyr Glu Arg Val Gly Thr Pro Met Asp 
210 215 220 

Arg Pro Met Asn Phe Thr Lys Cys Val Glu Cys Gly Gln Cys Ser Gln 
225 230 235 240 

Val Cys Pro Val Gly Ala Ile Thr Ala Arg Thr Glu Val Val Asp Val 
245 250 255 

Leu Arg His Leu Asp Thr Lys Arg Lys Val Val Val Cys Ser Thr Ala 
260 265 270 

Pro Ala Ile Arg Val Ala Pro Ala Glu Glu Phe Ser Thr Glu Ala Asp 
275 280 285 

Phe Asp Phe Thr Gly Lys Met Val Ala Gly Leu Arg Lys Leu Gly Phe 
290 295 300 

Asp Tyr Ile Phe Asp Thr Asn Phe Ser Ala Asp Leu Thr Ile Met Glu 
305 310 315 320 

Glu Gly Thr Glu Leu Ile Asp Arg Leu Asn Asn Gly Gly Lys Phe Pro 
325 330 335 

Met Phe Thr Ser Cys Cys Pro Gly Trp Ile Asn Met Val Glu Lys Ser 
340 345 350 

Tyr Pro Glu Leu Ser Asp Asn Leu Ser Ser Cys Lys Ser Pro Gln Gln 
355 360 365 

Met Ile Gly Ala Val Ile Lys Ser Tyr Phe Ala Lys Lys Leu Gly Leu 
370 375 380 

Ser Thr Glu Asp Ile Ile His Val Ser Ile Met Pro Cys Thr Ala Lys 
385 390 395 400 

Lys Gly Glu Ala Arg Arg Pro Glu Phe Val Gln Lys Gly Lys Asp Gly 
405 410 415 

Lys Asp Tyr Pro Asp Ile Asp Tyr Val Ile Thr Thr Arg Glu Leu Leu 
420 425 430 

Thr Leu Leu Lys Leu Lys Lys Ile Asn Pro Ala Glu Leu Pro Asp Asp 
435 440 445 

Lys Phe Asp Ser Pro Leu Gly Ile Gly Ser Ser Ala Gly Asn Leu Phe 
450 455 460 

Gly Val Thr Gly Gly Val Met Glu Ala Ala Ile Arg Thr Ala Gln Val 
465 470 475 480 

Ile Thr Gly Val Glu Asn Pro Ile Pro Leu Gly Glu Leu Lys Ala Ile 
485 490 495 

Arg Gly Leu Asp Gly Ile Lys Ala Ala Asn Val Pro Leu Lys Thr Lys 
500 505 510 

Asp Gly Lys Glu Val Ser Val Arg Ala Ala Val Val Ser Gly Gly Ala 
515 520 525 

Asn Ile Gln Lys Phe Leu Glu Lys Ile Lys Asn Lys Glu Leu Glu Phe 
530 535 540 

Asp Phe Ile Glu Met Met Met Cys Pro Gly Gly Cys Ile Asn Gly Gly 
545 550 555 560 

Gly Gln Pro Lys Ser Ala Asp Pro Glu Ile Val Ala Lys Lys Met Gln 
565 570 575 

Arg Met Tyr Thr Met Asp Asp Gln Ala Lys Leu Arg Leu Cys His Glu 
580 585 590 

Asn Pro Glu Ile Ile Asp Val Tyr Lys Asn Phe Leu Gly Glu Pro Asn 
595 600 605 

Ser His Leu Ala His Glu Leu Leu His Thr His Tyr Asn Asp Arg Ser 
610 615 620 

Lys Thr Ile His Asp Met Gly His His Glu Lys Lys 
625 630 635 

<210> 30 

<211> 555 

<212> PRT 

<213> Piromyces sp. E2 

<400> 30 

Cys Leu Val Asp Val Lys Gly Ser Trp Lys Pro Leu Thr Ala Cys Thr 
1 5 10 15 

Thr Glu Val Trp Glu Gly Met Glu Ile Glu Thr Asp Thr Pro Ala Val 
20 25 30 

Arg Glu Thr Val Arg Ser Ser Leu Ala Met Met Arg Glu Glu His Pro 
35 40 45 

Asn Asp Cys Met Thr Cys Glu Ser Asn Gly Asn Cys Glu Phe Gln Asp 
50 55 60 

Leu Ile Tyr Arg Tyr Gln Ile Asp Ala Gln His Pro Val Arg Thr Leu 
65 70 75 80 

Leu Arg Asn Lys Phe Lys Lys Thr Asn His Ser Ile Thr Glu Pro Cys 
85 90 95 

Tyr Ser Pro Phe Asp Asp Ser Thr Phe Ser Ile Ser Arg Asp Met Asn 
100 105 110 

Lys Cys Val Lys Cys Gly Arg Cys Val Arg Ala Cys His His Phe Gln 
115 120 125 

Asn Ile Asn Ile Leu Gly Phe Ile Asn Arg Ala Gly Tyr Glu Arg Val 
130 135 140 

Gly Thr Pro Met Asp Arg Pro Met Asn Phe Thr Lys Cys Val Glu Cys 
145 150 155 160 

Gly Gln Cys Ser Gln Val Cys Pro Val Gly Ala Ile Thr Glu Arg Asn 
165 170 175 

Glu Cys Ile Glu Val Leu Arg His Leu Asp Thr Lys Arg Lys Ile Val 
180 185 190 

Val Val Ser Thr Ala Pro Ala Ile Arg Val Ala Leu Ala Glu Glu Phe 
195 200 205 

Asn Ala Glu Pro Asp Phe Asp Phe Thr Gly Lys Met Val Ala Gly Leu 
210 215 220 

Lys Lys Leu Gly Phe Asp Tyr Ile Phe Asp Thr Asn Phe Ser Ala Asp 
225 230 235 240 

Leu Thr Ile Met Glu Glu Gly Thr Glu Leu Ile Thr Arg Leu Asn Glu 
245 250 255 

Gly Gly Lys Phe Pro Met Phe Thr Ser Cys Cys Pro Gly Trp Ile Asn 
260 265 270 

Met Val Glu Lys Ser Tyr Pro Glu Ile Arg Asp Asn Leu Ser Ser Cys 
275 280 285 

Lys Ser Pro Gln Gln Met Ile Gly Ala Val Ile Lys Thr Tyr Phe Ala 
290 295 300 

Lys Lys Ile Asn Ala Lys Pro Glu Asp Ile Ile His Val Ser Val Met 
305 310 315 320 

Pro Cys Thr Ala Lys Lys Gly Glu Ala Lys Arg Pro Glu Phe Lys Arg 
325 330 335 

Asp Gly Val Pro Asp Ile Asp His Val Ile Thr Thr Arg Glu Leu Ile 
340 345 350 

Thr Leu Leu Lys Leu Lys Arg Ile Asn Pro Ser Glu Leu Lys Asn Glu 
355 360 365 

Lys Phe Asp Ser Pro Leu Gly Ile Gly Ser Ser Ala Gly Asn Leu Phe 
370 375 380 
S«^^«^|P5T^I^WI I «Bt Glu Ala Ala Val Arg Thr Ala Gln Ile 
385 390 395 400 

Ile Thr Gly Val Glu Asn Pro Ile Pro Leu Gly Glu Leu Lys Ala Ile 
405 410 415 

Arg Gly Leu Asp Gly Ile Lys Ala Ala Ser Val Pro Leu Lys Thr Lys 
420 425 430 

Asp Gly Lys Asp Val Asn Val Arg Ala Ala Val Val Ser Gly Gly Ala 
435 440 445 

Asn Ile Gln Lys Phe Leu Glu Lys Leu Lys Lys Lys Glu Leu Glu Phe 
450 455 460 

Asp Phe Val Glu Met Met Met Cys Pro Gly Gly Cys Ile Asn Gly Gly 
465 470 475 480 

Gly Gln Pro Lys Ser Ala Asp Pro Lys Val Val Ala Lys Lys Met Glu 
485 490 495 

Arg Met Tyr Thr Met Asp Asp Gln Ala Ser Leu Arg Leu Ser His Glu 
500 505 510 

Asn Pro Glu Ile Thr Gln Ile Tyr Lys Glu Phe Leu Lys Glu Pro Asn 
515 520 525 

Gly His Leu Ser His Glu Leu Leu His Thr His Tyr Asn Asp Arg Ser 
530 535 540 

Lys Ala Ile Gln Asp Met Ser Leu His Gln Lys 
545 550 555 

<210> 31 

<211> 389 

<212> PRT 

<213> Neocallimastix frontalis 

<400> 31 

Thr Glu Arg Asn Glu Val Ile Glu Val Leu Arg Gln Leu Asp Ser Lys 
1 5 10 15 

Arg Lys Ile Leu Val Cys Ser Thr Ala Pro Ala Ile Arg Val Ala Leu 
20 25 30 

Ala Glu Glu Phe Asn Ala Asp Pro Asp Phe Asn Phe Thr Gly Lys Met 
35 40 45 

Val Ala Gly Leu Arg Lys Leu Gly Phe Asp Tyr Ile Phe Asp Thr Asn 
50 55 60 

Phe Ser Ala Asp Leu Thr Ile Met Glu Glu Gly Thr Glu Leu Ile Asn 
65 70 75 80 

Arg Leu Asn Asn Gly Gly Lys Phe Pro Met Phe Thr Ser Cys Cys Pro 
85 90 95 
Vlf TrVTW*Sri ke&mWG$u Lys Ser Tyr Pro Glu Leu Arg Glu Asn 
100 105 110 

Leu Ser Thr Cys Lys Ser Pro Gln Gln Met Ile Gly Ala Leu Ile Lys 
115 120 125 

Ser Tyr Phe Ala Lys Lys Leu Gly Val Ser Thr Glu Asp Ile Ile His 
130 135 140 

Val Ser Val Met Pro Cys Thr Ala Lys Lys Gly Glu Ala Lys Arg Pro 
145 150 155 160 

Glu Phe Val Gln Lys Gly Lys Asp Gly Lys Asn Tyr Pro Asp Ile Asp 
165 170 175 

Tyr Val Leu Thr Thr Arg Glu Leu Leu Thr Leu Met Lys Leu Lys Lys 
180 185 190 

Val Asn Pro Ala Glu Leu Ala Asp Asp Lys Leu Asp Ser Pro Leu Gly 
195 200 205 

Ile Ser Ser Ser Ala Gly Asn Leu Phe Gly Val Thr Gly Gly Val Met 
210 215 220 

Glu Ala Ala Val Arg Thr Ala Gln Ile Ile Thr Gly Val Glu Asn Pro 
225 230 235 240 

Ile Pro Leu Gly Glu Leu Lys Ala Val Arg Gly Leu Glu Gly Ile Lys 
245 250 255 

Ala Ala Thr Val Pro Leu Lys Thr Lys Glu Gly Lys Asp Ile Asn Val 
260 265 270 

Arg Ala Ala Val Val Ser Gly Gly Ala Asn Ile Gln Lys Phe Leu Glu 
275 280 285 

Lys Ile Lys Asn Lys Glu Val Glu Phe Asp Phe Val Glu Met Met Met 
290 295 300 

Cys Pro Gly Gly Cys Ile Asn Gly Gly Gly Gln Pro Lys Ser Ala Asp 
305 310 315 320 

Pro Lys Ile Val Thr Lys Lys Met Gln Arg Met Tyr Thr Met Asp Glu 
325 330 335 

Gln Ala Thr Leu Arg Leu Ser His Glu Asn Glu Glu Val Lys Gln Ile 
340 345 350 

Tyr Lys Glu Phe Leu Ile Glu Pro Asn Gly His Leu Ser His Glu Leu 
355 360 365 

Leu His Thr His Tyr Asn Asp Arg Ser Lys Ala Ile Gln Asp Met Ser 
370 375 380 

Leu His Glu Lys Lys 
385 
<210> 32 
<211> 458 
<212> PRT 

<213> Desulfovibrio desulfuricans 
<400> 32 

Met Asn Gly Gin Gin Asn val lie Arg He Asp Ser Asp lie cys Thr 
1 5 10 15 

Gly cys Gly Arg cys Lys Asp val cys Pro Val Gly Ala Val Glu Gly 
20 25 30 

val Gin Gly Thr Pro His Ser lie Arg Glu Asp Val Cys Val Leu Cys 
35 40 45 

Gly Gin cys val Gin Gin Cys Ser Ala Phe Ala Ser Phe Tyr Glu Gin 
50 55 60 

His Pro Ala cys lie Ala Glu Lys Lys Arg Glu Arg Gly Leu Phe val 
65 70 75 80 

ser Glu Ala Ala Pro Leu Phe Ala Ala Trp His Thr Gly Asp Ala Pro 
85 90 95 

Arg val Ala Gly Arg Leu Ala Glu Gly cys His Ser Met val Gin cys 
100 105 110 

Ala Pro Ala val Arg Ala Ala lie Gly Glu Glu Phe Gly Met Pro Ala 
115 120 125 

Gly Ala Leu Thr Pro Gly Arg Leu Ala Ala Ala Leu Arg Arg Leu Gly 
130 135 140 

Phe Asp Arg Val Tyr Asp Thr Asn Phe Ala Ala Asp Leu Thr lie Met 
145 150 155 160 

Glu Glu Gly ser Glu Leu Leu Gin Arg Met Glu Gly Ala Gly Pro Leu 
165 170 175 

Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val Arg Tyr Ala Glu Gin 
180 185 190 

Gin Phe Pro Asp Leu Leu Glu His Leu Ser Ser cys Lys Ser Pro Gin 
195 200 205 

Gin Met Ala Gly Ala Val Phe Lys Ser Tyr Gly Ala Gin Leu Asp Gly 
210 215 220 

Val Asp Pro Arg Gin Val Phe ser Val Ala Val Met Pro Cys Thr cys 
225 230 235 240 

Lys Lys Ala Glu Ala Gin Arg Pro Gly Met Glu His Asp Gly val Arg 
245 250 255 

Asp Val Asp Ala Val Leu Thr Thr Gly Glu Leu Ala Ala Met Leu Arg
260 265 270

Gln Ala His Ile Asp Phe Ala Ala Leu Pro Asp Glu Pro Phe Asp Arg
275 280 285 

Pro Leu Gly ser Tyr Ser Gly Ala Gly Asn lie Phe Gly Leu Thr Gly 
290 295 300 

Gly val Met Glu Ala Ala Leu Arg Thr Ala Tyr Glu Leu Val Thr Gly 
305 310 315 320 

Glu Pro val Pro cys Thr Glu Leu Val Tyr val Arg Gly Gly Glu Gly 
325 330 335

lie Arg His Ala Thr Leu Thr Met Asp Gly Arg Thr Phe Arg Val Ala 
340 345 350 

val Val Ala Gly Leu Gin His val Arg Pro Leu Leu Glu Ala Val Arg 
355 360 365 

Ala Gly Thr Cys Asp Val Asn Phe val Glu val Met Cys Cys Pro Gin 
370 375 380 

Gly cys lie ser Gly Gly Gly Gin Pro Lys Val Leu Leu Pro Phe Gin 
385 390 395 400 

Arg Asp Glu val Tyr Ala Ala Arg Lys Ala Ala Leu Tyr Arg His Asp 
405 410 415 

Ala Glu Leu Ala cys Arg Lys ser His Glu Asn pro Gin val Gin Ala 
420 425 430 

Leu Tyr Arg Glu Phe Leu Gly Glu Pro Leu ser His val ser His Asn 
435 440 445 

Leu Leu His Thr Val Tyr Gly Gin Thr Arq 
450 455 

<210> 33 
<211> 554 
<212> PRT 

<213> Desulfitobacterium hafniense 
<400> 33 

Met Met Gin Leu Lys His Pro Phe Gin ser Gly Phe Gin Gin Gin ser 
1 5 10 15

cys Lys Arg His Thr Lys Lys Val Val Val Asp Met Glu ser Lys Ala 
20 25 30 

Gly Lys Gly ser Asn Leu ser Arg Arg ser Phe Leu Lys Phe Ala Gly 
35 40 45 

Gly Ala Gly lie Ala Gly Ala Ser Leu ser Leu Thr Gly Cys Gly Gin 
50 55 60 

Pro Leu Thr Pro Ala Ser Ala val Gly Gly Glu Gly Trp Met Pro Thr 
65 70 75 80 

WO 2005/072262 PCT/US2005/001983 

050118 CIP sequence Listing 

[illegible] Trp Pro Thr Asn Val Arg Gly Arg Val
85 90 95

Pro Ile Asp Pro Glu Asn Pro Ala Leu Arg Arg Asp Asp Gln Lys Cys
100 105 110

lie Leu cys Gly Gin cys lie Glu val cys Lys Thr He Gin Ser Val 
115 120 125 

Tyr Gly Asn Tyr Glu Leu Pro Leu Lys Asn Glu lie Pro cys lie Asn 
130 135 140 

Cys Gly Gin cys lie His Trp cys Pro Ser Gly Ala lie Ser Glu Arg 
145 150 155 160 

Glu Asp lie Asp Gin val Ala Lys Ala Leu Ala Asp Pro Lys lie Thr 
165 170 175 

val Val Val Gin Thr Ala Pro Ala Thr Arg lie Gly Leu Gly Glu Glu 
180 185 190 

Phe Gly Leu Pro val Gly Thr Asn val Gin Gly Lys Gin val Ala Ala 
195 200 205 

Leu Arg Lys Leu Gly Phe Asp Val lie Phe Asp Thr Asn Phe Ala Ala 
210 215 220 

Asp Leu Thr He Met Glu Glu Gly Thr Glu Leu Val Lys Arg lie Thr 
225 230 235 240 

Gly Glu Leu His His Pro Leu Pro Gin Phe Thr Ser Cys Cys Pro Gly 
245 250 255 

Trp val Lys Phe Val Glu Tyr Tyr Tyr Pro Glu Leu Leu Pro Asn Leu 
260 265 270 

ser ser Ala Lys ser Pro Gin Gin Met Ala Gly Ala Leu Val Lys Thr 
275 280 285 

Tyr Phe Ala Glu Lys Asn His val Glu Pro Gin Lys lie Phe ser Val 
290 295 300 

Ala lie Met Pro cys Thr Ala Lys Lys Phe Glu cys Gin Arg Pro Glu 
305 310 315 320 

Met lie ser Ala Gin Thr Tyr Trp Gin Asp Glu Gin Val Ser Pro Asp 
325 330 335 

Val Asp val Val Leu Thr Thr Arg Glu Leu Ala Arg Met lie Lys Arq 
340 345 350 

Ala Gly lie Asp Leu Pro Ser Leu Pro Asp Glu Glu Tyr Asp Gin Leu 
355 360 365 

Met Gly val Ala Thr Gly Ala Gly Ala lie Phe Gly Thr Thr Gly Gly 
370 375 380 

Val Met Glu Ala Ala Val Arg Ser Ala Tyr Tyr Leu Val Thr Gly Glu
385 390 395 400 

Gin Pro Pro Ala Ala Leu Trp Gin Leu Thr Pro val Arg Gly Met Glu 
405 410 415 

Gly val Lys Glu Ala Ala Val Ser lie Pro Gly Ala Gly Glu lie Arq 
420 425 430 

lie Ala val lie Ser Gly Leu Asp Asn Ala Arg Ala lie Met Glu Gin 
435 440 445 

val Lys Ala Gly Asn ser Pro Trp Thr Phe lie Glu Val Met Ala Cys 
450 455 460 

Pro Gly Gly Cys Gin Tyr Gly Gly Gly Gin Pro Arg Ser Ser Ala Pro 
465 470 475 480 

Pro ser Asp Gly val Arg Asn Thr Arg Ala Ala Ser Leu Tyr Lys lie 
485 490 495 

Asp Ala Gin Ala Lys Leu Arg Asn Ser His Asp Asn Pro Gin lie Lys 
500 505 510 

Gin val Tyr Ala Glu Phe Leu Thr ser Pro Leu ser Glu Lys Ala Glu 
515 520 525 

Glu Leu Leu His Thr His Tyr He Ser Arg Ala Glu Glu Phe Asp Ala 
530 535 540 

Lys Lys Pro Gin Ser His Glu Tyr Glu val 
545 550 

<210> 34 

<211> 578 

<212> PRT 

<213> Eubacterium acidaminophilum 

<400> 34 

Met val Asn lie Thr lie Asp Gly Arg Gin Val Thr val Pro Ala Asn 
1 5 10 15 

Ser Thr val Leu Asp Ala Ala Arg Asp Met Gly lie Asn lie Pro Thr 
20 25 30 

Leu Cys Tyr Leu Lys Asp lie Asn Lys Thr Gly Ala Cys Arg Met cys 
35 40 45 

Leu Val Glu val Glu Gly lie Arg Asn Leu Gin Thr Ala cys Thr Phe 
50 55 60 

Pro Val Arg Asp Gly Leu Val Val Lys Thr Asn Thr Lys Arg Val Arg 
65 70 75 80 

Asp Ala Arg Arg Asp Asn Leu Gin Leu lie Leu Ser Asn His His Arq 
85 90 95 

Asp Cys Leu Ser Cys Phe Arg Asn Gly Ser Cys Glu Leu Gln Ala Leu
100 105 110 

Cys Asp Asp Met Gly Leu Ser Glu Leu Asp Phe Glu Ala Pro Lys Glu 
115 120 125 

Leu Lys Pro val Asp Met Leu ser His ser lie Val Arg Asp pro Asn 
130 135 140 

Lys cys lie Leu Cys Gly Arg Cys Val Ala val cys Asn Lys Val Gin 
145 150 155 160 

Glu val Gly lie Leu Ala Phe Thr Asn Arg Gly val Glu Thr Glu val 
165 170 175 

Ala Pro Ala Phe Ala Thr ser Met Ala Asp Ala Pro Cys lie Tyr cys 
180 185 190 

Gly Gin cys val Asn Val cys Pro Val Ala Ala Leu Arg Glu Lys Thr 
195 200 205 

Asp lie Glu Lys val Trp Glu Val Leu Glu Asp Glu Thr Lys His val 
210 215 220 

Val val Gin Val Ala Pro Ala val Arg Ala Ala Leu Gly Glu Met Phe 
225 230 235 240 

Gly Asn Pro lie Gly Thr Arg Val Thr Gly Lys Met Phe Thr Ala Leu 
245 250 255 

Lys Met Leu Gly Phe Gin Lys val Phe Asp Thr Asn Phe Ala Ala Asp 
260 265 270 

Leu Thr lie Met Glu Glu Gly Thr Glu Leu Leu Gly Arg lie Lys Asn 
275 280 285 

Gly Gly Thr Leu Pro Met lie Thr ser Cys Ser Pro Gly Trp lie Arq 
290 295 300 

Tyr Val Glu His Phe Tyr Pro Glu Leu Leu Asp His val Ser ser cys 
305 310 315 320 

Lys ser pro Gin Gin Met Met Gly Ala val Leu Lys ser Tyr Tyr Ala 
325 330 335 

Glu Lys Asn Asn lie Ala Pro Glu Asn Met lie Val val ser val Met 
340 345 350 

Pro Cys lie Ala Lys Lys Thr Glu ser Ala Lys Glu Glu Met Lys Asn 
355 360 365 

Val His Gly Thr Arg Asp Val Asp lie Val Leu Thr Thr Arg Glu Leu 
370 375 380 

Gly Lys Met Ile Lys Glu Ala Arg Ile Glu Phe Asn Asp Leu Gln Asp
385 390 395 400

Ser Asn Pro Asp Glu Phe Phe Gly Asp Tyr Thr Gly Ala Ala Val Ile
405 410 415 

Phe Gly Ala Thr Gly Gly val Met Glu Ala Ala lie Arg Thr val Ala 
420 425 430 

Asp lie val ser Gly Gin Glu Leu Glu Asp lie Glu Tyr Thr Ala Val 
435 440 445 

Arg Gly Leu Glu Gly lie Lys Glu Ala Ala Val Lys lie Gly Asp Leu 
450 455 460 

Glu Val Lys val Ala Val Ala His Gly Thr Ala Asn Ala Gly Lys Leu 
465 470 475 480 

Met Asp Leu val Arg Asp Gly Lys Ala Asp Tyr His Phe lie Glu lie 
485 490 495 

Met Gly cys Ser Gly Gly cys val Thr Gly Gly Gly Gin Pro His val 
500 505 510 

Asp ser Arg Thr Lys Glu Lys Val Asn val Lys Leu Glu Arg Ala Lys 
515 520 525 

Ala Leu Tyr Thr Glu Asp Lys Leu Arg Asp Lys Arg Lys Ser His His 
530 535 540 

Asn Glu ser val Lys Arg Leu Tyr Glu Glu Tyr Leu Gly Lys Pro Asn 
545 550 555 560 

Gly His Lys Ala His Glu Leu Leu His Thr His Tyr Lys Lys Arg Glu 
565 570 575 

Leu Phe 



<210> 35 
<211> 619 
<212> PRT 

<213> Rhodopseudomonas palustris 
<400> 35 

Met cys Thr Pro Asp Gin Ala Ser Leu ser Ala Arg Asp Pro Ala Glu 
1 5 10 15

Ala Thr lie Thr Leu Ser lie Asn Gly Val Ala cys Ala Gly Phe Ala 
20 25 30 

Asn Glu Thr lie Leu Ser Cys Ala Arg Arg Tyr Asp val Tyr lie Pro 
35 40 45

Thr Leu cys Glu Leu Glu Asp lie Asp His Thr Pro Gly Ala Cys Arg 
50 55 60 

Val Cys Leu Val Glu Ile Leu Gln Ala Gly Lys Asp Thr Pro Gln Ile
65 70 75 80

Val Thr Ala Cys Asn Thr Pro Val Arg Asp Gly Met Glu Val Gln Thr
85 90 95 



Arg Ser Lys Lys Ala Arg Asp Met Gin Arg Leu Gin val Glu Leu Leu 
100 105 110 



Met Ala Asp His Leu Gin Asp Cys Ala Thr cys lie Arg His Gly Ser 
115 120 125 



cys Glu Leu Gin Asp Leu Ala Gin Phe val Gly Leu Gin Gin Asn Arg 
130 135 140 



Phe Phe Asp Arg Glu Arg Thr Glu Ala Arg Pro val Asp His Ser ser 
145 150 155 160 



Pro ser Met Val Arg Asp Met Arg Arg cys Val Arg Cys Gin Arg Cys 
165 170 175 



val Ala lie Cys Arg Tyr His Gin Lys lie Asp Ala Leu Ala lie Glu 
180 185 190 



Gly ser Gly Leu Glu Arg Met Val Ala Leu Arg Asp Ala Asp Gly Tyr 
195 200 205 



Pro Asn ser Val Cys val Ser cys Gly Gin cys Val Leu val cys Pro 
210 215 220 



Thr Gly Ala Leu Gly Glu Arg Asp Glu Thr Asp Arg Ala Leu Asp Tyr 
225 230 235 240 



lie cys Asp Pro Asn val val Thr val Val Gin Phe Ala Pro Ala val 
245 250 255 



Arg Val Ala Phe Gly Glu Glu Phe Gly Leu Pro Ala Gly Thr Asn Val
260 265 270



Glu Gly Gin lie lie Ala Ala Cys Arg Lys Leu Gly Val Asp Val val 
275 280 285 



Leu Asp Thr Asn Phe Ala Ala Asp Val Val lie Met Glu Glu Gly Ala 
290 295 300 



Glu Leu Leu Ala Arg Leu Lys Gin Gly Arg Arg Pro Thr Phe Thr Ser 
305 310 315 320 



Cys cys Pro Ala Trp lie Asn Phe Ala Glu lie His Tyr Pro Asp Val 
325 330 335 



Leu Pro Leu Leu Ser ser Thr Lys Ser Pro Gin Gin val Leu ser Thr 
340 345 350 



Ile Ala Lys Ser Tyr Leu Pro Ala Gln Leu Gly Val Pro Ala Glu Arg
355 360 365

[illegible] Met Pro Cys Ile Ala Lys Lys Asp Glu Ala
370 375 380

Val Arg Pro Gin Met Val His Asp Gly Gin Pro Glu Thr Asp Leu Val 
385 390 395 400 

Leu Thr Thr Arg Glu Phe Ala Arg Leu Leu Arg Arg Glu Gly lie Asp 
405 410 415

Leu Lys Asp Leu Pro ser Ser Gin Phe Asp Arg Pro Phe Leu ser Ala 
420 425 430

Tyr ser Gly Ala Gly Ala lie Phe Gly Thr Thr Gly Gly val Met Glu 
435 440 445 

Ala Ala val Arg Thr lie Tyr Ala Leu val Asn Gly Arg Glu Leu Glu 
450 455 460 

Arg lie Glu Leu Thr Gin Leu Arg Gly Phe Glu Gly Leu Arg Glu Ala 
465 470 475 480 

Thr Val Asp Leu Gly Ala Pro Val Gly Glu Val Lys val Ala Met Val 
485 490 495 

His Gly Leu Gly Asp Thr Arg Lys Leu Val Glu ser val Leu ser Gly 
500 505 510 

Glu Ala Asn Tyr Asp Phe lie Glu Val Met Ala Cys Pro Gly Gly cys 
515 520 525 

Val Asp Gly Gly Gly Ser Leu Arg Ser Lys Lys Ala Tyr Leu Pro Leu 
530 535 540 

Ala Leu Lys Arg Arg Glu Thr lie Tyr Asn val Asp Arg Ala Ala Lys 
545 550 555 560 

val Arg Gin ser His Asn Asn Pro Gin Val Gin Ala Leu Tyr Arg Glu 
565 570 575 

Leu Leu Gin Ala Pro Asn Ser Glu lie Ala His Arg Leu Leu His Thr 
580 585 590 

His Tyr Ala ser Arg Lys Arg Glu Leu Gin His Thr val Lys Glu lie 
595 600 605 

Trp Asp Asp Leu Thr Met ser Thr lie Leu Tyr 
610 615 

<210> 36 

<211> 644 

<212> PRT 

<213> Clostridium thermocellum 

<400> 36 

Met Asp ser Phe Leu Met Lys Gly Tyr lie Lys Glu Ala Asn lie Asp 
1 5 10 15 

Tyr [illegible] Ser Met Glu Asp Leu Pro Lys Trp Glu Phe
20 25 30

Arg Glu lie Pro Lys Val Pro Arg Ala Val Met Pro ser Leu Ser Leu 
35 40 45 

Glu Glu Arg Lys Asn Asn Phe Asn Glu Val Glu Leu Gly Leu ser Glu 
50 55 60 

Glu Val Ala Arg Lys Glu Ala Arg Arg cys Leu Lys Cys Gly cys Ser 
65 70 75 80

Ala Arg Phe Thr cys Asp Leu Arg Lys Glu Ala Ser Asn His Gly lie 
85 90 95 

Val Tyr Glu Glu Pro lie His Asp Arg Pro Tyr lie Pro Lys Val Asp 
100 105 110 

Asp His Pro Phe lie Val Arg Asp His Asn Lys cys lie ser cys Gly 
115 120 125 

Arg cys lie Ala Ala Cys Ala Glu lie Glu Gly Pro Gly val Leu Thr 
130 135 140 

Phe Tyr Met Lys Asn Gly Arg Gin Leu Val Gly Thr Lys ser Gly Leu 
145 150 155 160

Pro Leu Arg Asp Thr Asp cys Val ser Cys Gly Gin Cys Val Thr Ala 
165 170 175 

cys Pro cys Ala Ala Leu Asp Tyr Arg Arg Glu Arg Gly Lys Val Val 
180 185 190

Arg Ala lie Asn Asp Pro Lys Lys Thr val val Gly Phe Val Ala Pro 
195 200 205 

Ala Val Arg Ser Leu lie ser Asn Thr Phe Gly Val ser Tyr Glu Glu 
210 215 220 

Ala ser Pro Phe Met Ala Gly Leu Leu Lys Lys Leu Gly Phe Asp Lys 
225 230 235 240 

Val Phe Asp Phe Thr Phe Ala Ala Asp Leu Thr lie Val Glu Glu Thr 
245 250 255 

Thr Glu Phe Leu ser Arg lie Gin Asn Lys Gly val Met Pro Gin Phe 
260 265 270

Thr ser cys cys Pro Gly Trp lie Asn Phe Val Glu Lys Arg Tyr Pro 
275 280 285 

Glu lie lie Pro His Leu ser Thr Cys Lys ser Pro Gin Met Met Met 
290 295 300 

Gly Ala Thr Val Lys Asn His Tyr Ala Lys Leu Met Gly lie Asn Lys 
305 310 315 320 

[illegible] Asp [illegible] Phe Val Val Ser Ile Val Pro Cys Leu Ala Lys Lys Tyr
325 330 335

Glu Ala Ala Arg Pro Glu Phe lie His Asp Gly lie Arg Asp Val Asp 
340 345 350 

Ala val Leu Thr Thr Thr Glu Met Leu Glu Met Met Glu Leu Ala Asp 
355 360 365 

lie Lys Pro Ser Glu Val Val Pro Gin Glu Phe Asp Glu Pro Tyr Lys 
370 375 380 

Gin val Ser Gly Ala Gly lie Leu Phe Gly Ala ser Gly Gly Val Ala 
385 390 395 400 

Glu Ala Ala Leu Arg Met Ala val Glu Lys Leu Thr Gly Lys val Leu 
405 410 415 

Thr Asp His Leu Glu Phe Glu Glu lie Arg Gly Phe Glu Gly val Lys 
420 425 430 

Glu Ser Thr lie Asp val Asn Gly Thr Lys val Arg val Ala Val Val 
435 440 445 

Ser Gly Leu Lys Asn Ala Glu Pro lie lie Glu Lys lie Leu Asn Gly 
450 455 460 

Val Asp Val Gly Tyr Asp Leu lie Glu Val Met Ala Cys Pro Gly Gly 
465 470 475 480 

Cys lie cys Gly Ala Gly His Pro Val Pro Glu Lys lie Asp ser Leu 
485 490 495 

Glu Lys Arg Gin Gin val Leu Val Asn lie Asp Lys val ser Lys Tyr 
500 505 510 

Arg Lys Ser Gin Glu Asn Pro Asp lie Leu Arg Leu Tyr Asn Glu Phe 
515 520 525

Tyr Gly Glu Pro Asn Ser Pro Leu Ala His Glu Leu Leu His Thr His 
530 535 540 

Tyr Thr Pro Lys His Gly Asp Ser Thr cys ser Pro Glu Arg Lys Lys 
545 550 555 560 

Gly Thr Ala Ala Phe Asp val Gin Glu Phe Thr lie cys Met Cys Glu 
565 570 575 

Ser cys Met Glu Lys Gly Ala Glu Asn Leu Tyr Asn Asp Leu Ser Ser 
580 585 590 

Lys lie Arg Leu Phe Lys Met Asp Pro Phe Val Gin lie Lys Arg lie 
595 600 605 

Arg Leu Lys Glu Thr His Pro Gly Lys Gly val Tyr lie Ala Leu Asn 
610 615 620 

Gly Lys Gln Ile Glu Glu Pro Met Leu Ser Gly Asn Ile Pro Asp Glu
625 630 635 640

Ser Glu Ser Glu



<210> 37 

<211> 572 

<212> PRT 

<213> Clostridium perfringens 

<400> 37 

Met Asn Lys lie lie lie Asn Asp Lys Thr lie Glu Phe Asp Gly Asp 
1 5 10 15 



Lys Thr lie Leu Asp Leu Ala Arg Glu Asn Gly Phe Asp lie Pro val 
20 25 30 



Leu cys Glu Leu Lys Asn Cys Gly Asn Lys Gly Gin cys Gly Val Cys 
35 40 45 



Leu val Glu Gin Glu Gly Asn Asp Arg Leu Leu Arg Ser Cys Ala lie 
50 55 60 



Lys Ala Lys Asp Gly Met val lie Lys Thr Asp Ser Glu Lys Val Leu 
65 70 75 80 



Glu Ala Arg Lys Glu Arg Val Ala Glu Leu Leu Asp Glu His Glu Phe 
85 90 95 



Lys cys Gly Pro Cys Lys Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu 
100 105 110 



Val lie Lys Thr Lys Ala Arg Ala His Lys Pro Phe Val val Ala 
115 120 125 



Lys Ser Glu Tyr Val Asp Asp Arg Ser Lys Ser lie Val Leu Asp Arg 
130 135 140 



Ser Lys cys Val Lys Cys Gly Arg cys Val Ala Ala Cys Arg Thr Arg 
145 150 155 160



Thr Ala Thr Asn Ser lie Lys Phe His Arg lie Asp Gly Val Arg Leu 
165 170 175 



Val Gly Pro Glu Glu Leu Lys Cys Phe Asp Asp Thr Asn cys Leu Leu 
180 185 190 



Cys Gly Gin Cys lie Ala Ala cys Pro Val Asp Ala Leu ser Glu Lys 
195 200 205 



Ser His lie Glu Arg val Gin Glu Ala Leu Asn Asp Pro Glu Lys His 
210 215 220



Val Ile Val Ala Met Ala Pro Ala Val Arg Thr Ser Met Gly Glu Leu
225 230 235 240

Phe Lys Met Gly Tyr Gly Gln Asp Val Thr Gly Lys Leu Tyr Thr Ala
245 250 255 

Leu Arg Glu Leu Gly Phe Asp Lys val Phe Asp He Asn Phe Gly Ala 
260 265 270 

Asp Met Thr Ile Met Glu Glu Ala Thr Glu Leu Ile Glu Arg Ile Lys
275 280 285 

Asn Asn Gly Pro Phe Pro Met Leu Thr Ser Cys Cys Pro Ser Trp val 
290 295 300 

Arg Glu Val Glu Asn Tyr Phe Pro Glu Leu val Glu Asn Leu Ser Ser 
305 310 315 320 

Ala Lys Ser Pro Gin Gin lie Phe Gly Ala Ala Ser Lys Thr Tyr Tyr 
325 330 335 

Pro Gin val Ala Asp lie Asp Pro Lys Lys Val Phe Thr Val Thr Val 
340 345 350 

Met Pro cys Thr Ser Lys Lys Phe Glu Ala Asp Arg Pro Glu Met Glu 
355 360 365 

Asn Glu Gly lie Arg Asn He Asp Ala Val lie Thr Thr Arg Glu Leu 
370 375 380 

Ala Arg Met lie Lys Ala Ala Lys lie Asp Phe Ala Lys Leu Glu Asp 
385 390 395 400 

Gly Glu val Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly Val lie 
405 410 415 

Phe Gly Ala Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Lys 
420 425 430 

Asp Phe Met Glu Asn Asp Asn Leu Asp Asn Val Asp Tyr Glu Ala val 
435 440 445 

Arg Gly Leu Ala Gly lie Lys Glu Ala Glu Val Glu lie Ala Gly Asn 
450 455 460 

Glu Tyr Lys Leu Ala Val Val Ser Gly Ala Ala Asn Val Phe Glu Leu 
465 470 475 480 

Val Lys ser Gly Lys lie Asn Asp Tyr His Phe lie Glu Val Met Ala 
485 490 495 

Cys Pro Gly Gly cys val Asn Gly Gly Gly Gin Pro His lie ser Ala 
500 505 510 

Glu Asp Ser Asp Lys Met Asp lie Arg Glu Val Arg Ala Ser Val Leu 
515 520 525

Tyr Asn Gin Asp Lys Asn Leu Glu Lys Arg Lys Ser His Gin Asn Ser 
530 535 540

Ala Leu Leu Lys Met Tyr Glu ser Tyr Met Gly Lys Pro Gly His Gly 
545 550 555 560 

Arg Ala His Glu Leu Leu His Met Lys Tyr Lys Lys 
565 570 

<210> 38 
<211> 583 
<212> PRT

<213> Clostridium thermocellum 
<400> 38 

Met His val Leu Lys Leu Val His Ser Thr Gin Tyr Trp Arg Ala Glu 
1 5 10 15 

Glu Met Asp Asn Arg Glu Tyr Met Leu lie Asp Gly lie Pro Val Glu 
20 25 30 

lie Asn Gly Glu Lys Asn Leu Leu Glu Leu lie Arg Lys Ala Gly lie 
35 40 45 

Lys Leu Pro Thr Phe cys Tyr His Ser Glu Leu ser val Tyr Gly Ala 
50 55 60 

cys Arg Met Cys Met Val Glu Asn Glu Trp Gly Gly Leu Asp Ala Ala 
65 70 75 80 

Cys ser Thr Pro Pro Arg Ala Gly Met Ser lie Lys Thr Asn Thr Glu 
85 90 95

Arg Leu Gin Lys Tyr Arg Lys Met lie Leu Glu Leu Leu Leu Ala Asn 
100 105 110 

His cys Arg Asp cys Thr Thr Cys Asn Asn Asn Gly Lys cys Lys Leu 
115 120 125 

Gin Asp Leu Ala Met Arg Tyr Asn lie Ser His lie Arg Phe Pro Asn 
130 135 140 

Thr Ala Ser Asn Pro Asp val Asp Asp ser Ser Leu Cys lie Thr Arq 
145 150 155 160 

Asp Arg ser Lys Cys lie Leu cys Gly Asp cys val Arg Val cys Asn 
165 170 175 

Glu val Gin Asn Val Gly Ala lie Asp Phe Ala Tyr Arg Gly ser Lys 
180 185 190 

Met Thr lie ser Thr Val Phe Asp Lys Pro lie Phe Glu Ser Asn cys 
195 200 205 

Val Gly cys Gly Gin cys Ala Leu Ala cys Pro Thr Gly Ala lie Val 
210 215 220 

Val Lys Asp Asp Thr Gln Lys Val Trp Lys Glu Ile Tyr Asp Lys Asn
225 230 235 240

Thr Arg Val Ser Val Gln Ile Ala Pro Ala Val Arg Val Ala Leu Gly
245 250 255

Lys Glu Leu Gly Leu Asn Asp Gly Glu Asn Ala lie Gly Lys lie val 
260 265 270 

Ala Ala Leu Arg Arg Met Gly Phe Asp Asp lie Phe Asp Thr ser Thr 
275 280 285 

Gly Ala Asp Leu Thr Val Leu Glu Glu Ser Ala Glu Leu Leu Arg Arg
290 295 300

Ile Arg Glu Gly Lys Asn Asp Met Pro Leu Phe Thr Ser Cys Cys Pro
305 310 315 320 

Ala Trp val Asn Tyr Cys Glu Lys Phe Tyr Pro Glu Leu Leu Pro His 
325 330 335 

val Ser Thr Cys Arg ser Pro Met Gin Met Phe Ala Ser lie lie Lys 
340 345 350 

Glu Glu Tyr ser Thr Ser Ser Lys Arg Leu Val His val Ala val Met 
355 360 365 

Pro cys Thr Ala Lys Lys Phe Glu Ala Ala Arg Lys Glu Phe Lys val 
370 375 380 

Asn Gly val Pro Asn Val Asp Tyr Val Leu Thr Thr Gin Glu Leu val 
385 390 395 400 

Arg Met lie Lys Glu ser Gly He val Phe ser Glu Leu Glu Pro Glu 
405 410 415 

Ala lie Asp Met Pro Phe Gly Thr Tyr Thr Gly Ala Gly val lie Phe 
420 425 430

Gly val ser Gly Gly val Thr Glu Ala val Leu Arg Arg Val Val ser 
435 440 445 

Asp Lys ser Pro Thr ser Phe Arg ser Leu Ala Tyr Thr Gly val Arq 
450 455 460 

Gly Met Asn Gly Val Lys Glu Ala Ser Val Met Tyr Gly Asp Arg Lys
465 470 475 480 

Leu Lys val Ala Val val ser Gly Leu Lys Asn Ala Gly Asp Leu lie 
485 490 495

Glu Arg lie Lys Ala Gly Glu His Tyr Asp Leu Val Glu Val Met Ala 
500 505 510 

cys Pro Gly Gly cys He Asn Gly Gly Gly Gin Pro Phe val Gin ser 
515 520 525 

[illegible] Lys Gly Leu Tyr Ser Ala Asp Lys Leu
530 535 540

Cys Asn lie Lys ser Ser Glu Glu Asn Pro Leu Met Met Thr Leu Tyr 
545 550 555 560 

Lys Gly lie Leu Lys Gly Arg val His Glu Leu Leu His Val Asp Tyr 
565 570 575 

Ala ser Lys Lys Glu Ala Lys 
580 

<210> 39 
<211> 439 
<212> PRT 

<213> Desulfovibrio desulfuricans
<400> 39 

Met Ala Gly Cys Lys Ala Gin His Pro Pro Ala Ala Tyr Leu Ala Gly 
1 5 10 15 

Leu Glu val Pro Ala Ala Gly Ser Glu val Thr Met Glu Gly Val Arg 
20 25 30 

Tyr Lys Met Asn Ala Pro Lys Asp Val Asp Pro Ala Thr lie Arg Phe 
35 40 45 

val Glu val Asp His Asp Lys cys Met Ala cys Gly Glu Cys Glu Tyr 
50 55 60 

His cys Pro Thr Gly Val Met Gin Glu Val Thr Glu Asp Gly Tyr Arg 
65 70 75 80 

Gly val val Asp Pro val Ala Cys Val Asn Cys Gly Gin cys Leu Ala 
85 90 95 

Asn cys Pro Phe Gly Ala lie His Glu Glu val Ser Phe val Gly Glu 
100 105 110 

Leu Tyr Glu Lys Leu Lys Asp Pro Asp Thr val val Val Ser Met Pro 
115 120 125 

Ala Pro Ala val Arg Tyr Ala Leu Gly Glu Cys Phe Gly Leu Pro Thr 
130 135 140 

Gly Thr Tyr Val Gly Gly Gin Met His Ala Ala Leu Arg Arg Leu Gly 
145 150 155 160 

Phe Asn Leu Val Trp Asp Thr Glu Trp Thr Ala Asp Val Thr lie Met 
165 170 175 

Glu Glu Gly Thr Glu Leu Leu Glu Arg val Lys His Gly Asn Met Pro 
180 185 190 

Leu Pro Gin Phe Thr ser cys Cys Pro Gly Trp lie Lys Phe Ala Glu 
195 200 205 

[illegible] Glu Lys His Leu Ser Thr Cys Lys Ser Pro
210 215 220



lie Ala Met lie Gly Pro Leu Ala Lys Thr Tyr Gly Ala Gin Glu Ala 
225 230 235 240 



Gly val Pro Ala Lys Lys Met Tyr Thr Val Ser lie Met Pro cys lie 
245 250 255 



Ala Lys Lys Phe Glu Gly Met Arg Pro Glu Met Asn Ala Ser Gl 
260 265 270 



Arg Asp He Asp Ala Thr lie Thr Thr Arg Glu Leu Ala Trp Met He 
275 280 285 



Lys Lys Ala Gly lie Asp Phe Thr Ser Leu Pro Ser Glu Glu Pro 
290 295 300 



Pro Ala Leu Gly Met Ser Thr Gly Ala Ala Thr lie Phe Cys Thr Ser 
305 310 315 320 



Gly Gly val Met Glu Ala Ala Leu Arg Leu Ala Tyr Glu Ala Leu Ser 
325 330 335 



Gly Gly Thr Leu Ala Asp Pro Asp lie Lys Val val Arg Thr His Glu 
340 345 350 



Gly lie Asn Thr Ala Glu val Pro Val Pro Asn Phe Gly Thr Val 
355 360 365 



val Ala val Ala ser Gly Leu Asp Asn Ala Ala Lys Leu cys Glu Glu 
370 375 380 



val Arg Ala Gly Lys ser Pro Tyr His Phe lie Glu Val Met Thr Cys 
385 390 395 400 



Pro Gly Gly cys val Asn Gly Gly Gly Gin Pro Leu Glu Pro Gly Met 
405 410 415 



Leu Gin ser Ser Leu Phe Lys Ser Thr lie Thr Lys lie Asn Arq Arq 
420 425 430 



Phe Thr Arg Arg Ser Val Ala
435

<210> 40
<211> 379
<212> PRT

<213> Desulfovibrio desulfuricans
<400> 40

Met Asn Leu Val Glu Met Glu Lys Ile Gln Tyr Val Asp Gln Ser Pro
1 5 10 15

Asp Pro Arg Ala Asn Pro Asp Glu Leu Phe Phe Ile Gln Ile Asp Pro
20 25 30

[illegible] Asp Thr Cys Gln Glu Tyr Cys Pro Thr Gly
35 40 45 

Ala lie Phe Gly Asp Thr Gly Ser Ala His Ser lie Pro His Glu Glu 
50 55 60 

lie cys lie Asn cys Gly Gin cys Leu Thr His cys Pro val Gly Ala 
65 70 75 80

lie Tyr Glu Val Gin ser Trp val Arg Glu Leu ser Glu Lys lie Lys 
85 90 95 

Asp Pro Glu lie Lys val lie Ala Met Pro Ala Pro Ala Val Arq Tyr 
100 105 110 

Gly Leu Gly Glu Cys Phe Gly Met Pro Val Gly Thr Val Thr Thr Gly 
115 120 125 

Lys Met Leu Thr Ala Leu Gin Met Leu Gly Phe Asp His val Trp Asp 
130 135 140 

Asn Glu Phe Thr Ala Asp val Thr lie Trp Glu Glu Gly Thr Glu Phe 
145 150 155 160 

Val Lys Arg Leu Thr Gly Gin lie Asp Lys Pro Leu Pro Gin Phe Thr 
165 170 175 

Ser cys cys Pro Gly Trp His Lys Tyr val Glu Ser Phe Tyr Pro Glu 
180 185 190 

Leu Phe Pro His Leu ser ser Cys Lys ser Pro lie Gly Met Met Gly 
195 200 205 

Ala Leu Ala Lys Thr Tyr Gly Pro Asp Val Met Lys Tyr Asp Arq ser 
210 215 220 

Lys Val Tyr Thr val Ser lie Met Pro Cys Thr Ala Lys Lys Tyr Glu 
225 230 235 240 

Gly Met Arg Ala Asp Leu Trp Ser ser Gly Tyr Lys Asp lie Asp Ala 
245 250 255 

Thr lie Asp Thr Arg Glu Leu Ala Tyr Met lie Lys Lys Ala Gly lie 
260 265 270 

Asp Phe Ala Ala Leu Pro Asp Gly Lys Arg Asp Thr Leu Met Gly Asp 
275 280 285 

Ser Thr Gly Gly Ala Thr lie Phe Gly Val Ser Gly Gly Val Met Glu 
290 295 300 

Ala Ala Leu Arg Tyr Ala Tyr Glu Ala val Thr Gly Lys Lys Pro ser 
305 310 315 320 

ser Trp Asp Phe Thr Met Val Arg Gly Leu Asn Gly lie Lys Glu Gly 
325 330 335 

Thr Val Thr Ile Gly Asp Ala Lys Ile Asn Val Ala Val Val His Gly
340 345 350 

Ala Lys Arg Phe Ala Glu Val cys Glu val lie Lys Thr Gly Lys Ser 
355 360 365 

Pro Cys lie Ser Ser Ser Leu cys Leu Pro Arg 
370 375 

<210> 41 

<211> 421 

<212> PRT 

<213> Desulfovibrio desulfuricans 

<400> 41 

Met Asn Leu val Glu Met Glu Lys lie Gin Tyr Val Asp Gin ser Pro 
1 5 10 15 

Asp Pro Arg Ala Asn Pro Asp Glu Leu Phe Phe lie Gin lie Asp Pro 
20 25 30 

Glu Lys Cys lie Gly cys Asp Thr cys Gin Glu Tyr Cys Pro Thr Gly 
35 40 45 

Ala lie Phe Gly Asp Thr Gly ser Ala His ser lie Pro His Glu Glu 
50 55 60 

lie cys lie Asn Cys Gly Gin Cys Leu Thr His Cys Pro Val Gly Ala 
65 70 75 80 

lie Tyr Glu val Gin Ser Trp val Arg Glu Leu ser Glu Lys lie Lys 
85 90 95 

Asp Pro Glu lie Lys val lie Ala Met Pro Ala Pro Ala val Arg Tyr 
100 105 110 

Gly Leu Gly Glu cys Phe Gly Met Pro Val Gly Thr Val Thr Thr Gly 
115 120 125 

Lys Met Leu Thr Ala Leu Gin Met Leu Gly Phe Asp His val Trp Asp 
130 135 140 

Asn Glu Phe Thr Ala Asp Val Thr lie Trp Glu Glu Gly Thr Glu Phe 
145 150 155 160 

Val Lys Arg Leu Thr Gly Gin lie Asp Lys Pro Leu Pro Gin Phe Thr 
165 170 175 

Ser cys cys Pro Gly Trp His Lys Tyr val Glu ser Phe Tyr Pro Glu 
180 185 190 

Leu Phe Pro His Leu Ser ser cys Lys ser Pro lie Gly Met Met Gly 
195 200 205 

Ala Leu Ala Lys Thr Tyr Gly Pro Asp val Met Lys Tyr Asp Arg ser 
210 215 220 

Lys val Tyr Thr Val Ser He Met Pro cys Thr Ala Lys Lys Tyr Glu 
225 230 235 240 

Gly Met Arg Ala Asp Leu Trp Ser Ser Gly Tyr Lys Asp lie Asp Ala 
245 250 255 

Thr lie Asp Thr Arg Glu Leu Ala Tyr Met lie Lys Lys Ala Gly lie 
260 265 270 

Asp Phe Ala Ala Leu Pro Asp Gly Lys Arg Asp Thr Leu Met Gly Asp 
275 280 285 

Ser [illegible] Gly Gly Ala Thr Ile Phe Gly Val Ser Gly Gly Val Met Glu
290 295 300 

Ala Ala Leu Arg Tyr Ala Tyr Glu Ala Val Thr Gly Lys Lys Pro ser 
305 310 315 320 

Ser Trp Asp Phe Thr Met val Arg Gly Leu Asn Gly lie Lys Glu Gly 
325 330 335 

Thr Val Thr lie Gly Asp Ala Lys He Asn Val Ala Val Val His Gly 
340 345 350 

Ala Lys Arg Phe Ala Glu val cys Glu Val lie Lys Thr Gly Lys ser 
355 360 365

Pro [illegible] Ile Glu [illegible] Met Ala Cys Pro Gly Gly Cys Val Cys
370 375 380

Gly Gly Gly Gin Pro Val Met Pro Gly val Leu Glu Ala Met Asp Arq 
385 390 395 400 

Lys val Ser Arg Thr Phe Ala Gly Leu Lys Glu Arg Leu Asn Arg Met 
405 410 415 

Ser Ser Ser Lys Ala 
420 

<210> 42 
<211> 369 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 42 

Cys Asp Gly Lys Trp Leu ser Pro Ala cys Val Thr Thr Val Trp Asp 
1 5 10 15

Gly Leu Lys He Asp Thr Lys Ser Lys Asn val Arg Asp ser Val Glu 
20 25 30 

Asn Asn Leu Lys Glu Leu Leu Asp cys His Asp Glu Thr cys Ser Ala 
35 40 45 

Cys lie Ala Asn His Arg Cys Gin Phe Arg Asp Met Asn val Ala Tyr 
50 55 60 

Ser Val Lys Ala Glu Thr Lys Glu Ile Cys Ser Glu Glu Gly Ile Asp
65 70 75 80
Glu Ser Thr Asn Ala lie Arg Leu Asp Thr Ser Lys Cys Val Leu Cys 
85 90 95 



Gly Arg Cys lie Arg Ala cys Glu Glu Val Ala Gly Thr Ser Ala lie 
100 105 110 



lie Phe Gly Asn Arg Ala Lys Lys Met Arg lie Gin Pro Thr Phe Gly 
115 120 125 



val Thr Leu Gin Glu Thr ser Cys lie Lys cys Gly Gin cys Thr Leu 
130 135 140



Tyr Cys Pro Val Gly Ala lie Thr Glu Lys Ser Gin val Lys Glu Ala 
145 150 155 160 



Leu Asp lie Leu Ala Asn Lys Gly Lys Lys lie Thr Val Val Gin val 
165 170 175 



Ala Pro Ala Val Arg Val Ala Leu Ser Glu Ala Phe Gly Tyr Lys Glu 
180 185 190 



Gly Thr val Thr Thr Gly Lys Met val ser Ala Leu Lys Ala Leu Gly 
195 200 205 



Phe Asp Leu Val Tyr Asp Thr Asn Tyr Gly Ala Asp Leu Thr lie cys 
210 215 220 



Glu Glu Ala Gly Glu Leu val Asn Arg Leu Arg Asp Pro Asn Ala Lys 
225 230 235 240 



Phe Pro Met Phe Thr ser cys Cys Pro Ala Trp Val Asn Tyr val Glu 
245 250 255 



Gin ser Ala Pro Asp Phe lie Pro Asn Leu Ser Ser Cys Arg Ser Pro 
260 265 270 



Gin Gly Met Leu Ser Ala Leu lie Lys Asn Tyr Leu Pro Lys Leu Leu 
275 280 285 



Asp val Lys Gin Glu Asp Val Leu Asn Phe Ser lie Met Pro cys Thr 
290 295 300 



Ala Lys Lys Asp Glu Val Glu Arg Pro Glu Leu Arg Thr Lys ser Gly 
305 310 315 320



Pro Lys Glu Thr Asp Met Val Leu Thr Val Arg Glu Leu Val Glu Met
325 330 335

Ile Lys Leu Ser Asn Ile Asp Phe Asn Asn Leu Pro Asp Thr Gln Phe
340 345 350 



Asp Asn lie Phe Gly Phe Gly ser Gly Ala Gly Gin lie Phe Ala Ala 
355 360 365 

Thr



<210> 43 
<211> 369 
<212> PRT 

<213> Trichomonas gallinae 
<400> 43 

cys Asp Gly Lys Trp Leu Ser Pro Ala Cys Val Thr Thr val Trp Asp 
1 5 10 15

Gly Leu Arg He Asp Thr Lys Ser Lys Val Val Arg Asp Ser Val Glu 
20 25 30 

Asn Asn Leu Lys Glu Leu Leu Asp Cys His Asp Glu Thr Cys Ser Ser 
35 40 45 

cys val Ala Asn His Arg cys Gin Phe Arg Asp Met Asn Val Ala Tyr 
50 55 60

Ser val Lys Ala Asp Thr Lys Glu lie Cys Ser Glu Glu Gly lie Asp 
65 70 75 80 

Glu Ser Thr His Ala lie Arg Leu Asp Thr ser Lys cys Val Leu Cys 
85 90 95 

Gly Arg Cys lie Arg Ala cys Glu Glu Val Ala Gly Thr Ser Ala He 
100 105 110 

lie Phe Gly Asn Arg Ala Lys His Met Arg lie Gin Pro Thr Phe Gly 
115 120 125 

Gly Thr Leu Gin Glu Thr Ala cys lie Lys cys Gly Gin cys Thr Leu 
130 135 140 

Tyr cys Pro val Gly Ala lie Thr Glu Lys ser Gin val Lys Glu Ala 
145 150 155 160 

Leu Asp lie Leu Ala Asn Lys Gly Lys Lys Val Thr Val val Gin val 
165 170 175 

Ala Pro Ala val Arg Val Ala Leu Ser Glu Ala Phe Gly Tyr Lys Glu 
180 185 190 

Gly Thr val Thr Thr Gly Lys Met val ser Ala Leu Lys Ala Leu Gly 
195 200 205 

Phe Asp Leu Val Tyr Asp Thr Asn Tyr Gly Ala Asp Leu Thr lie Cys 
210 215 220 

Glu Glu Ala Gly Glu Leu Val Asn Arg Leu Lys Asp Pro Lys Ala val 
225 230 235 240 

Phe Pro Met Phe Thr ser cys Cys Pro Ala Trp Val Asn Tyr val Glu 
245 250 255 

Gln Ser Ala Pro Asp Phe Ile Pro Asn Leu Ser Ser Cys Arg Ser Pro
260 265 270 

Gin Gly Met Leu ser ser Leu lie Lys Asn Tyr Leu Pro Lys Leu Leu 
275 280 285 

Gly lie Lys Gin Glu Glu Val Met Asn Phe Ser lie Met Pro Cys Thr 
290 295 300 

Ala Lys Lys Asp Glu lie Glu Arg Pro Glu Leu Gin Thr Lys Thr Gly 
305 310 315 320

Leu Lys Glu Thr Asp Met val Leu Thr val Arg Glu Leu Val Glu Met 
325 330 335

lie Lys Leu ser Asn lie Asp Phe Asn Asn Leu Pro Asp Thr Pro Phe 
340 345 350 

Asp Asn lie Phe Gly Phe Gly ser Gly Ala Gly Gin lie Phe Ala Ala 
355 360 365 

Thr 



<210> 44 

<211> 456 

<212> PRT 

<213> Nyctotherus ovalis

<400> 44 

Met lie ser Arg Leu lie Ala Lys Lys Ala Pro Leu Phe Leu Arg Thr 
1 5 10 15 

Phe Ala Thr ser Glu Met lie Ser Leu Lys lie Asp Gly Lys lie lie 
20 25 30 

ser val Pro Lys Gly lie Met Leu Ala Asp Ala lie Lys Lys Ala Gly 
35 40 45 

Ala Asn val Pro Thr Met Cys Tyr His Pro Asp Leu Pro Thr ser Gly 
50 55 60 

Gly lie cys Arg Val Cys Leu val Glu Ser Ala Lys ser Pro Gly Tyr 
65 70 75 80 

Pro lie lie Ser cys Arg Thr pro val Glu Glu Gly Met Glu lie val 
85 90 95 

Thr Gin Gly Ser Lys Met Lys Glu Tyr Arg Gin Ala Asn Leu Ala Leu 
100 105 110 

Met Leu ser Arg His Pro Asn Ala Cys Leu Ser cys Thr Ser Asn Thr 
115 120 125 

Asn Cys Lys Thr Gin Glu Leu ser Ala Asn Met Asn lie Gly Gin Cys 
130 135 140 
Gly Phe Ala Asn Ala Thr Pro Pro Lys Asn Asp Asp Ser Tyr Asp Met 
145 150 155 160 

Thr Thr Ala Ile Glu Arg Asp Asn Asp Lys Cys Ile Asn Cys Asp Ile 
165 170 175 

Cys Val His Thr cys ser Leu Gin Gly Leu Asn Ala Leu Gly Phe Tyr 
180 185 190 

Asn Glu Glu Gly His Ala Val Lys Ser Met Gly Thr Leu Asp Val Ser 
195 200 205 

Glu cys lie Gin Cys Gly Gin Cys lie Asn Arg Cys Pro Thr Gly Ala 
210 215 220 

lie Thr Glu Lys Ser Glu lie Arg Pro val Leu Asp Ala lie Asn lie 
225 230 235 240 

Gin Gin Arg Leu Val Phe Gin Met Ala Pro ser lie Arg val Ala val 
245 250 255 

Ala Glu Glu Phe Gly lie Lys Pro Gly Glu Lys lie Leu Lys Asn Glu 
260 265 270 

lie Ala Thr Ala Leu Arg Lys Leu Gly ser Asn Val Phe val Leu Asp 
275 280 285 

Thr Asn Phe Ser Ala Asp Leu Thr lie lie Glu Glu Gly His Glu Leu 
290 295 300 

lie Glu Arg Leu Tyr Arg Asn val Thr Gly Lys Lys Leu Leu Gly Gly 
305 310 315 320 

Asp His Met Pro lie Asp Leu Pro Met Leu Thr ser Cys Cys Pro Gly 
325 330 335 

Trp lie Met Phe lie Glu Lys Asn Tyr Pro Asp Leu Leu Asn Asn Leu 
340 345 350 

Ser Thr cys Lys Ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly 
355 360 365 

Tyr Trp Ala Lys Asn lie Lys Lys Met Asp Pro Lys Asp lie Val ser 
370 375 380 

Val ser lie Met Pro Cys Thr Ala Lys Lys Ala Glu Lys Glu Arg Pro 
385 390 395 400 

Gin Leu Arg Gly Asp Glu Gly Tyr Lys Asp val Asp Tyr lie Leu Thr 
405 410 415 

Thr Arg Glu Leu Ala Lys Met Leu Lys Gin Ser Asn lie Asp Leu Ala 
420 425 430 

Lys Met Glu Pro Thr Pro Phe Asp Lys Val Met ser Glu Gly Thr Gly 
435 440 445 

Ala Ala val lie Phe Gly Val Thr 
450 455 

<210> 45 

<211> 369 

<212> PRT 

<213> Trichomonas vaginalis 

<400> 45 

Cys Asp Gly Lys Trp Leu Ala Pro Ala Cys Val Thr Thr val Trp Asp 
1 5 10 15 

Gly Leu Lys lie Asp Thr Lys ser Lys Met Val Lys Glu Ser val Glu 
20 25 30 

Asn Asn Leu Lys Glu Leu Leu Asp Cys His Asp Glu Thr cys Ser Ser 
35 40 45 

Cys val Ala Asn His Arg Cys Gin Phe Arg Asp Met Asn val Ala Tyr 
50 55 60 

ser lie Lys Ala Glu Thr Lys Glu Glu cys ser Glu Glu Gly lie Asp 
65 70 75 80 

Glu ser Thr Asn Ser lie Arg Leu Asp Thr Ser Lys cys Val Leu Cys 
85 90 95 

Gly Arg cys lie Arg Ala Cys Glu Glu Val Ala Gly Gin ser Ala lie 
100 105 110 

lie Phe Gly Asn Arg Ala Lys His Met Arg lie Gin Pro Thr Phe Gly 
115 120 125 

Gln Leu Gln Asp Thr Ser Cys Ile Lys Cys Gly Gln Cys Thr Leu 
130 135 140 

Tyr cys Pro Val Gly Ala He Thr Glu Lys ser Gln Val Lys Gln Ala 
145 150 155 160 

Leu Asp lie Leu ser Asn Lys Gly Lys Lys lie Ser Val lie Gln Val 
165 170 175 

Ala Pro Ala Val Arg val Ala Leu Ser Glu Ala Phe Gly Tyr Lys Glu 
180 185 190 

Gly Ser Val Thr Thr Gly Lys Met Val Ser Ala Leu Lys Ala Leu Gly 
195 200 205 

Phe Asp Tyr Val Tyr Asp Thr Asn Tyr Ser Ala Asp Leu Thr Ile Val 
210 215 220 

Glu Glu Ala Gly Glu Leu Val Gln Arg Leu Lys Asn Pro Asn Ala Val 
225 230 235 240 

Phe Pro Met Phe Thr ser cys cys Pro Ala Trp val Asn Tyr Val Glu 
245 250 255 

Gin ser Ala Pro Asp Phe lie Pro Asn Leu Ser Ser cys Arg ser Pro 
260 265 270 

Gin Gly Met Leu Ser ser Leu val Lys Asn Tyr Leu Pro Lys val Leu 
275 280 285 

Asn lie Pro Val Glu Asp Val Leu Asn Phe Ser He Met Pro Cys Thr 
290 295 300 

Ala Lys Lys Asp Glu He Glu Arg Pro Glu Leu Arg Thr Lys Asp Gly 
305 310 315 320 

His Lys Glu Thr Asp Met Val Leu Thr Val Arg Glu Leu val Glu Met 
325 330 335 

lie Lys Leu Ser Gly lie Asp Phe Asn Asn Leu Pro Asp Thr Pro Phe 
340 345 350 

Asp ser lie Phe Gly Phe Gly ser Gly Ala Gly Gin lie Phe Ala Ala 
355 360 365 

Thr 



<210> 46 
<211> 464 
<212> PRT 

<213> Entamoeba histolytica 
<400> 46 

Arg Leu His Thr val Thr Gly His Asp His Asn His ser lie Gin Phe 
1 5 10 15 

Asp Trp ser Lys cys Met Gly cys Gly Met cys Ala Thr Lys cys Thr 
20 25 30 

Phe Gly Val Leu val Lys Gin Pro Pro Lys lie Pro Pro Phe val Gin 
35 40 45 

Pro Asn Arg Glu Lys Leu Ser Gin Glu Asn Thr Asp Lys Thr Arg Val 
50 55 60 

Leu lie Asp Glu Ser Glu Cys Thr Gly cys Gly Gin Cys Ser Leu Val 
65 70 75 80 

Cys Asn Phe Gly ser lie Thr Pro lie Asp His Leu val Asp Thr Phe 
85 90 95 

Lys Ala Lys Glu Ala Gly Lys Lys Leu val Ala Met lie Ala Pro Ser 
100 105 110 

Thr Arg Leu Gly Val Ala Glu Ala Met Gly Met Pro lie Gly ser Thr 
115 120 125 

Ala Met Ala Gin Leu Val His Cys Leu Arg Leu lie Gly Phe Asp Tyr 
130 135 140 

val Phe Asp Val Asp Ala Gly Ala Asp Lys Thr Thr Met Asp Asp Tyr 
145 150 155 160 

Ala Glu val lie Glu Met Lys Lys Glu Gly Lys Gly Pro Ala lie Thr 
165 170 175 

Ser Cys Cys Pro Ala Trp Ile Glu Leu Val Glu Lys Glu Tyr Pro Asp 
180 185 190 

Leu lie Pro Asn Val ser Thr Ala Arg Ser Pro He Gly cys Leu Ala 
195 200 205 

Gly Cys Ile Lys Arg Gly Trp Ala Lys Asp Val Ile Ala Val Glu 
210 215 220 

Asp Leu Tyr Thr Val Gly lie Met Pro Cys He Ala Lys Lys Thr Glu 
225 230 235 240 

ser Gin Arg Gin Gin lie His Gin Asp Tyr Asp Ala Ser cys Thr ser 
245 250 255 

Asn Glu Ile Ala Ala Tyr Phe Lys Lys His Leu Pro Pro Glu Glu Cys 
260 265 270 

Lys Phe Thr Gin Glu Arg Glu Glu Ala Leu Ala Lys Thr Glu Asp Gly 
275 280 285 

Gln Cys Asp Leu Pro Phe Arg Arg Ile Ser Gly Gly Ser Asn Ile Phe 
290 295 300 

Gly Arg Thr Gly Gly Val Cys Glu Thr Val Leu Arg Val Ile Ala Arg 
305 310 315 320 

Asn Ala Gly val Asp Trp Asn Ser cys Thr Val Asn Lys Glu Glu Thr 
325 330 335 

Phe Lys His Ala Ala Ser Gly Ser Thr Met Thr Asn Leu Ser Val Asp 
340 345 350 

lie Gly Gly Thr lie lie Thr Gly Ala val cys His Gly Gly Tyr Ala 
355 360 365 

lie Arg His Ala cys Glu Leu lie Arg Lys Gly Glu Leu Lys val Asp 
370 375 380 

val val Glu Met Met Ala cys val Gly Gly Cys Leu Gly Gly Ala Gly 
385 390 395 400 

Gln Pro Lys Ile Pro Pro Ala Lys Lys Leu Glu Met Asp Lys Arg Arg 
405 410 415 

Val Met Leu Asp lie Leu Asp Gin Gin Thr Asp lie Arg Ala Ala Asn 
420 425 430 
[...] Trp Ile Asp Lys His Phe Asp His Gln 
435 440 445 

Gly Ala His Gin His Leu His Thr Tyr Phe Thr pro Arg Tyr Gin Asn 
450 455 460 

<210> 47 
<211> 474 
<212> PRT 

<213> Giardia intestinalis 
<400> 47 

Met Pro Pro Lys Pro Gin His Asp Val Thr Gly Val Asp Ser Asn Asn 
1 5 10 15 

Ala lie Met lie Asp Tyr Ala Lys cys He Gly cys Asn Met cys lie 
20 25 30 

Lys Ala cys Asp val Gin Gly lie Gly Val Tyr Lys Gin Asn Glu Lys 
35 40 45 

Pro Lys Tyr Pro Pro lie val Lys Leu ser Thr Leu Phe Asn Ser Asp 
50 55 60 

Cys lie Gly Cys Gly Gin Cys Ala Thr lie Cys Pro val Asp Ala lie 
65 70 75 80 

Ala Pro Lys Asn Asn Leu Glu lie Tyr Lys Gly Glu Ser Ala Ser Lys 
85 90 95 

Lys val Arg Val Ala Leu lie Ala Pro Ser Thr Arg val Ala Phe Gly 
100 105 110 

Asp val Phe Gly Leu Pro lie Gly Thr Asn Thr lie Tyr Ser Leu lie 
115 120 125 

Arg Met Leu Lys Gin Tyr Leu Gly Phe Asp Tyr Val Phe Asp val Asn 
130 135 140 

Phe Gly Ala Asp Glu Thr Thr Val lie Asp Thr Gin Glu Leu Leu His 
145 150 155 160 

Phe Lys His Glu Gly Arg Gly Pro Val Phe Thr ser Cys Cys Pro Ala 
165 170 175 

Trp val Asn Leu cys Glu Met Lys Tyr Pro Glu Leu Leu Pro Gin val 
180 185 190 

Ser Thr Ala Lys Ser cys val Ala Met Val Ala Thr Leu val Lys Arg 
195 200 205 

Arg Trp Val Gin Glu His Leu lie Pro Lys Gly lie val Asp Ser val 
210 215 220 

Asp Asp Val Tyr Val Ala Asp lie Met Pro cys Thr Ala Lys Lys Asp 
225 230 235 240 
[...] Leu Asn Arg Asp Val Asp Ile Cys Leu Thr 
245 250 255 

Val Arg Glu Val Ala Glu His Leu Tyr Phe Leu His Gly Ala Arg Leu 
260 265 270 

Thr Leu Glu Glu val Glu Ala Asp Ala Leu Val Leu Arg Pro Gly Arg 
275 280 285 

ser Thr Gin Lys Lys Trp Asp Phe Asp Ala Pro Phe Asn Thr Val ser 
290 295 300 

Gly Gly ser His lie Phe Gly Lys Thr Gly Gly Val Ala Glu Thr Cys 
305 310 315 320 

Leu Arg Phe lie Ser Tyr Met Lys Lys ser Pro lie Glu Asn Val Lys 
325 330 335 

Glu Glu Leu Leu Lys Glu Phe Lys Thr Pro Gly Gin Leu val Gin Thr 
340 345 350 

val Lys Leu val ser cys Glu lie Ala Gly Glu Thr Tyr Arg Ala Leu 
355 360 365 

lie Ala His Gly Gly Ser Ala lie Asn Ala Ala Ala Arg Met val Leu 
370 375 380 

Asn Lys Glu val Glu cys Asp Val Val Glu Gin Met Ala Cys Pro Gly 
385 390 395 400 

Gly cys Gin Asn Gly Gly Gly Met Pro Lys lie Lys Gly Lys Lys Glu 
405 410 415 

Ala Val Leu Thr Arg Ala Ser Thr Leu Asp lie Leu Asp Gly Lys Glu 
420 425 430 

Arg Phe Ala Ser Ala Gly Glu Asn Lys Thr Leu Trp Gly Phe Asn Gly 
435 440 445 

cys Leu Thr Glu His Glu Ala His Glu Leu Leu His Thr His Tyr Gin 
450 455 460 

His Arg Pro val Glu Ser Leu Leu Pro Gin 
465 470 

<210> 48 
<211> 844 
<212> PRT 

<213> Desulfitobacterium hafniense 
<400> 48 

Met Val Lys lie lie ser lie Thr Asn Asn Ala Lys Arg Gin Gly Lys 
1 5 10 15 

Gly Thr ser Arg Lys Glu Lys Gin Ala Met Lys Glu Val Thr Lys Gin 
20 25 30 
35 40 45 

Asp Leu Thr He Leu Gin Ala Leu Leu Gin Glu Asp lie His lie Pro 
50 55 60 

His Leu Cys Tyr Asp lie Arg Leu Glu Arg ser Asn Gly Asn cys Gly 
65 70 75 80 

Leu cys val val Glu Leu Gly Glu Gly ser Glu Gin Gin Asp Val Lys 
85 90 95 

Ala cys His Thr Pro lie Gin Glu Gly Met lie lie His Thr Asn ser 
100 105 110 

Pro Arg Leu Glu His Tyr Arg Lys lie Arg Leu Glu Gin lie Leu Ala 
115 120 125 

Asp His Asn Ala Asp cys Val Ala Pro cys val Met Thr cys Pro Ala 
130 135 140 

Asn lie Asp lie Gin Ser Tyr Leu ser His Ala Gly Asn Gly Asn Phe 
145 150 155 160 

Glu Thr Ala lie Lys val lie Lys Glu Arg Asn Pro Phe Pro lie Val 
165 170 175 

cys Gly Arg Val cys Pro His ser cys Glu Ala Gin cys Arg Arg Asn 
180 185 190 

Leu lie Asp Glu Pro val Ala lie Asn His Val Lys Arg Phe lie Ala 
195 200 205 

Asp Trp Asp lie Ala His Glu Gin Pro Trp Ala Pro Arg Lys Lys Ala 
210 215 220 

Ala Thr Gly Lys Lys lie Ala val Val Gly Ala Gly ser ser Gly Leu 
225 230 235 240 

ser Ala Ala Tyr Tyr ser Ala lie Gin Gly His Asp Val Thr Val Phe 
245 250 255 

Glu Arg His Pro Arg Ala Gly Gly Met Met Arg Tyr Gly lie pro Glu 
260 265 270 

Tyr Arg Leu Pro Lys Glu Thr Leu Asp Arg Glu lie Gly Leu lie Ala 
275 280 285 

Asp Leu Gly Val Lys lie Met Thr Asn Lys Ala Leu Gly Thr His lie 
290 295 300 

Arg Leu Glu Asp Leu His Gin Asp Phe Asp Ala Val Tyr Leu Ala lie 
305 310 315 320 

Gly Ser Trp Arg Ala Thr Pro Leu Gin lie Glu Gly Asp Asn Leu Glu 

325 330 335 
Gly Val Trp Leu Gly Ile Asn Phe Leu Glu Gln Val Thr Lys Gly Ala 
340 345 350 

Asp lie Lys Leu Gly Glu His val Val val lie Gly Gly Gly Asn Thr 
355 360 365 

Ala lie Asp Cys Ala Arg Thr Ala Leu Arg Lys Gly Ala Gly ser val 
370 375 380 

Lys Leu val Tyr Arg Arg Thr Arg Glu Glu Met pro Ala Glu ser Tyr 
385 390 395 400 

Glu val Glu Glu Ala lie His Glu Gly Val Glu Met Tyr Phe Leu Thr 
405 410 415 

Ala Pro His Lys lie Val Ala Glu Gly Gly Arg Lys Leu Leu His Cys 
420 425 430 

lie Lys Met Thr Leu Gly Glu Pro Asp Arg ser Gly Arg Arg Arg Pro 
435 440 445 

lie Pro lie Glu Gly ser Glu Thr Ala Phe Glu Ala Asp Thr lie lie 
450 455 460 

Gly Ala lie Gly Gin ser Thr Asn Thr Gin Phe Leu Tyr His Asp Leu 
465 470 475 480 

Pro Val Lys Leu Asn Lys Trp Gly Asp lie Glu lie Asn Gly Lys Thr 
485 490 495 

Met Gin Thr Ser Glu Met Asn lie Phe Ala Gly Gly Asp cys val Thr 
500 505 510 

Gly Pro Ala Thr val lie Gin Ala val Ala Ala Gly Arg His Ala Ala 
515 520 525 

Glu Ala Met Asp Ser Phe Leu Met Lys Gly Tyr val Lys Glu Gin Pro 
530 535 540 

Met Asp Tyr Ser Cys Ser Arg Gly Ser Leu Glu Asp Leu Pro Gin Trp 
545 550 555 560 

Glu Phe Glu Lys lie Pro Arg Leu Lys Arg Ala Pro Met Pro Ala Leu 
565 570 575 

Pro Pro Ala Glu Arg Arg Asp Asn Phe Arg Glu Val Glu Thr Gly Leu 
580 585 590 

Ser Glu Glu Thr Ala Arg Ala Glu Ala Arg Arg cys Leu Lys cys Gly 
595 600 605 

Cys Tyr Glu Arg Tyr Asp Cys Asp Leu Arg Gin Glu Ala Ser Leu His 
610 615 620 

His Val Glu Phe Lys Lys Pro Val His Glu Arg Pro Tyr lie Pro lie 
625 630 635 640 
Val Glu Asp His Ser Ile Ile Ile Arg Asp His Asn Lys Cys Ile Ser 
645 650 655 

cys Gly Arg Cys lie Ala Ala Cys Ala Glu Val Glu Gly Pro Asp lie 
660 665 670 

Leu ser Phe Tyr Met Lys His Gly Arg Gin Leu Val Gly Thr Lys Ser 
675 680 685 

Gly Leu Pro Leu Asp Gin Thr Asp Cys Val Ser Cys Gly Gin Cys Val 
690 695 700 

Asn Ala Cys Pro Cys Gly Ala Leu Asp Tyr Arg ser Glu lie Gly Arg 
705 710 715 720 

val Phe Arg Ala lie Asn Asp Pro Gly Lys Thr Thr val Ala Phe val 
725 730 735 

Ala Pro Ala Val Arg Ser Val Val Ser ser Gin Tyr Gly Val ser Tyr 
740 745 750 

Gin Glu Ala Ser Arg Phe lie Ala Gly Leu Leu Lys Lys lie Gly Phe 
755 760 765 

Asp Lys Val Phe Asp Phe Thr Phe Ala Ala Asp Leu Thr lie Val Glu 
770 775 780 

Glu Thr Thr Glu Phe Leu Thr Arg Leu Gln Ser His Lys Pro Ile Pro 
785 790 795 800 

Gin Phe Thr Ser Cys Cys Pro Gly Trp Val Asn Phe Val Glu Arg Arg 
805 810 815 

Tyr Pro Glu lie lie Pro Tyr Leu Ser Ser Cys Lys ser Pro Gin Met 
820 825 830 

Met Met Gly Ala Thr val Lys lie Thr Leu Arg Asn 
835 840 

<210> 49 
<211> 119 
<212> PRT 

<213> Nyctotherus velox 
<400> 49 

lie Leu Phe Met Glu Lys Asn Tyr Pro Asp Met Leu Asn His Leu ser 
1 5 10 15 

Thr cys Lys Ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly Tyr 
20 25 30 

Trp Ala Lys Asn val Lys Lys lie Asp Pro Lys Asp Val Val ser val 
35 40 45 

Ser lie Met Pro cys Thr Ala Lys Lys Glu Glu Lys Asp Arg lie Thr 
50 55 60 
Leu Lys Ser Asp Glu Gly Tyr Asn Asn Val Asp Tyr Val Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Ala Lys Met Phe Lys Gin Ser Asn lie Asp Pro Ser Lys 
85 90 95 

Leu Pro Pro Thr Gin Phe Asp Asn Val Met Ser Glu Gly Thr Gly Ala 
100 105 110 

Ala val lie Phe Gly Val Thr 
115 

<210> 50 

<211> 476 

<212> PRT 

<213> Oryza sativa 

<400> 50 

Met Ala Ser ser ser Ser ser Ala ser ser Arg Phe Ser Pro Ala Leu 
1 5 10 15 

Gin Ala Ser Asp Leu Asn Asp Phe lie Ala Pro Ser Gin Asp Cys lie 
20 25 30 

lie Ser Leu Asn Lys Gly Pro Ser Ala Arg Arg Leu Pro lie Lys Gin 
35 40 45 

Lys Glu lie Ala Val Ser Thr Asn Pro Pro Glu Glu Ala Val Lys lie 
50 55 60 

Ser Leu Lys Asp Cys Leu Ala Cys Ser Gly Cys lie Thr Ser Ala Glu 
65 70 75 80 

Thr val Met Leu Glu Lys Gin Ser Leu Gly Asp Phe lie Thr Arg lie 
85 90 95 

Asn Ser Asp Lys Ala Val lie val Ser val ser Pro Gin ser Arg Ala 
100 105 110 

Ser Leu Ala Ala Phe Phe Gly Leu ser Gin Ser Gin val Phe Arg Lys 
115 120 125 

Leu Thr Ala Leu Phe Lys Ser Met Gly Val Lys Ala val Tyr Asp Thr 
130 135 140 

Ser Ser ser Arg Asp Leu Ser Leu lie Glu Ala Cys Ser Glu Phe val 
145 150 155 160 

Thr Arg Tyr His Gin Asn Gin Leu Ser Ser Gly Lys Glu Ala Gly Lys 
165 170 175 

Asn Leu pro Met Leu ser ser Ala cys Pro Gly Trp lie cys Tyr Ala 
180 185 190 

Glu Lys Thr Leu Gly Ser Phe lie Leu Pro Tyr lie Ser Ala val Lys 
195 200 205 
Ser Pro Gln Gln Ala Ile Gly Ala Ala Ile Lys His His Met Val Gly 
210 215 220 

Lys Leu Gly Leu Lys Pro His Asp Val Tyr His val Thr val Met Pro 
225 230 235 240 

Cys Tyr Asp Lys Lys Leu Glu Ala Val Arg Asp Asp Phe Val Phe ser 
245 250 255 

val Glu Asp Lys Asp val Thr Glu val Asp ser Val Leu Thr Thr Gly 
260 265 270 

Glu Val Leu Asp Leu lie Gin ser Arg ser val Asp Phe Lys Thr Leu 

275 280 285 

Glu Glu ser Pro Met Asp Arg Leu Leu Thr Asn val Asp Asp Asp Gly 
290 295 300 

Gin Leu Tyr Gly Val Ser Gly Gly Ser Gly Gly Tyr Ala Glu Thr Val 
305 310 315 320 

Phe Arg His Ala Ala His val Leu Phe Asp Arg Lys lie Glu Gly Ser 
325 330 335 

val Asp Phe Arg lie Leu Arg Asn ser Asp Phe Arg Glu Val Thr Leu 
340 345 350 

Glu val Glu Gly Lys Pro val Leu Lys Phe Ala Leu cys Tyr Gly Phe 
355 360 365 

Arg Asn Leu Gin Asn lie lie Arg Lys lie Lys Met Gly Lys Cys Glu 
370 375 380 

Tyr His Phe lie Glu Val Met Ala Cys Pro Ser Gly Cys Leu Asn Gly 
385 390 395 400 

Gly Gly Gin lie Lys Pro Ala Lys Gly Gin ser Ala Lys Asp Leu lie 
405 410 415 

Gin Leu Leu Glu Asp val Tyr lie Gin Asp Val Ser val Ser Asn Pro 
420 425 430 

Phe Glu Asn Pro lie Ala Lys Arg Leu Tyr Asp Glu Trp Leu Gly Gin 
435 440 445 

Pro Gly Ser Glu Asn Ala Lys Lys Tyr Leu His Thr Lys Tyr His Pro 
450 455 460 

Val Val Lys Ser Val Ala Ser Gin Leu Gin Asn Trp 
465 470 475 

<210> 51 
<211> 114 
<212> PRT 

<213> Psalteriomonas lanterna 
<400> 51 

Ile Asn Leu Val Glu Lys His Tyr Pro Glu Tyr Leu Pro Asn Leu Ser 
1 5 10 15 

ser cys Arg ser Pro Gin Gly Met Leu Ser ser Leu lie Lys Asn Tyr 
20 25 30 

Trp Ala Lys Lys Met Gly lie Glu Pro Lys Asp val Val Val Val ser 
35 40 45 

Phe Met Pro Cys Gly Ala Lys Lys Asp Glu lie Lys Arg Pro Gin Leu 
50 55 60 

Lys Gly Glu Thr Asp Tyr val Leu Thr Thr Arg Glu Leu Gly Lys Leu 
65 70 75 80 

Phe Lys Met Gly Gly Leu Asn Asp Leu Ser val Leu Glu Pro Val Lys 
85 90 95 

Tyr Asp Asp Pro Leu Gly Glu Ser Thr Gly Ala Ala val lie Phe Gly 
100 105 110 

Ala Thr 



<210> 52 

<211> 119 

<212> PRT 

<213> Nyctotherus ovalis 

<400> 52 

lie Met Phe Met Glu Lys Asn Tyr Pro Asp Met Leu Asn His Leu Ser 
1 5 10 15 

Thr Cys Lys Ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly Tyr 
20 25 30 

Trp Ala Lys Asn lie Lys Lys Met Asp Pro Lys Asp lie Val Ser val 
35 40 45 

ser lie Met Pro cys Thr Ala Lys Lys Ala Glu Lys Glu Arg Pro Gin 
50 55 60 

Leu Arg Gly Asp Glu Gly Tyr Lys Asp Val Asp Tyr lie Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Ala Lys Met Leu Lys Gin Ser Asn lie Asp Leu Gly Lys 
85 90 95 

Met Glu Pro Thr Pro Phe Asp Lys Val Met Ser Glu Gly Thr Gly Ala 
100 105 110 

Ala val lie Phe Gly val Thr 
115 

<210> 53 
<211> 119 
<212> PRT 

<213> Nyctotherus ovalis 
<400> 53 

lie Met Phe Met Glu Lys Asn Tyr Pro Asp Met Leu Asn His Leu Ser 
1 5 10 15 

Thr cys Lys ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly Tyr 
20 25 30 

Trp Ala Lys Asn Val Lys Lys Met Asp Pro Lys Asp lie Val Ser Val 
35 40 45 

Ser He Met Pro Cys Thr Ala Lys Lys Ala Glu Lys Glu Arg pro Gin 
50 55 60 

Leu Arg Gly Asp Glu Gly Tyr Lys Asp Val Asp Tyr lie Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Ala Lys Met Leu Lys Gin ser Asn lie Asp Leu Gly Lys 
85 90 95 

Met Glu Pro Arg Pro Phe Asp Lys Val Met Ser Glu Gly Thr Gly Ala 
100 105 110 

Ala Val lie Phe Gly val Thr 
115 

<210> 54 
<211> 520 
<212> PRT 

<213> Rhodospirillum rubrum 
<400> 54 

Met Arg Pro Val Gin Arg Pro Arg Arg Trp Pro Gly Leu Arg Gin Arg 
1 5 10 15 

Leu ser Pro Glu Arg Pro Val Asp Arg Arg Ser Arg Arg Arg ser Gly 
20 25 30 

Ala Ala Arg Pro Gly Arg Arg Arg Gly Ser Gly Val Gin His Glu lie 
35 40 45 

Leu Arg ser Val ser Gin Arg Asp Met Ser Met ser lie Gin Pro Thr 
50 55 60 

Val Thr lie Asp Pro Glu Leu cys Thr Gly Cys Gly Arg cys Val Glu 
65 70 75 80 

Thr cys Pro Val Gin Ala lie Ala Gly Ser Arg Gly Lys Ala His Glu 
85 90 95 

He Glu Ala Ala Ala cys Val Ser Cys Gly Arg Cys val Ala Thr Cys 
100 105 110 

Ala Ala Phe Asp Ser lie Phe Asp Ala Phe Pro Thr Pro Arg Pro Val 
115 120 125 
Arg Leu Lys Arg Arg Gly Leu Pro Gly Ser Leu Lys Glu Pro Leu Phe 
130 135 140 

Ala Ala His Asp Pro Ser Arg lie Glu Ala Val Arg Lys Ala Phe Ala 
145 150 155 160 

Thr Pro Lys Arg Met Thr Val Met Gin Val Asp Thr Met Ala Cys Val 
165 170 175 

Ala Leu Ala Glu Asp Phe Gly Leu Pro Pro Gly Ser Leu Ser Pro Leu 
180 185 190 

Lys lie Ala Ser Ala Ala Arg Gin Leu Gly Phe Asp Arg Val Tyr Arg 
195 200 205 

Thr ser Phe Pro Ala Gly Leu Ala val Leu Glu Thr Ala His Glu Met 
210 215 220 

Ala Ala Arg Leu Ala Asn Gly Gly Asn Leu Pro val lie Asn ser Ser 
225 230 235 240 

cys Pro Ala Val Val Ala Phe Leu Glu Arg Arg Tyr Pro Glu Leu Leu 

245 250 255 

His Tyr Leu Ser Thr Val Lys Ser Pro His Gin lie Ala Gly Ala Leu 
260 265 270 

Tyr Asn ser Tyr Leu Ala Asp Ala Ala Asn Leu Ala Pro Ala Asn lie 
275 280 285 

His Lys val ser val val Ala cys Leu ser His Lys Ala Glu Ala Glu 
290 295 300 

Arg Pro Glu Met Met Thr Cys Gly Cys Pro Asp lie Asp Thr val Leu 
305 310 315 320 

Thr Ala Arg Glu Leu Ala lie Leu lie Lys Asp Ala Gly lie Asp val 
325 330 335 

Pro Leu Leu Gly Asp Gly Glu Phe Asp Asn Asp Phe Pro Glu lie Glu 
340 345 350 

Gly Leu Asp Thr Leu Tyr Cys Ala Pro Gly Asp Val Ser Arg Ala Val 
355 360 365 

Leu Gly Ala Gly Arg Trp Phe Leu Gly Gin Gly Glu Gly val Gly Ala 
370 375 380 

Pro Ala Gly Glu Thr val Glu Val Leu Asp Glu Ala Thr Arg Leu Thr 
385 390 395 400 

Arg Leu Ala Tyr Pro Gly Gly Thr Leu Gin Ala Leu Thr Val Ala Gly 
405 410 415 

Phe Asp Lys Ala Val Pro Tyr Leu Glu Ala lie Lys Ala Gly Arg Asn 
420 425 430 
Ala Phe Gln Phe Leu Glu Ile Ala Ser Cys Pro Gln Gly Cys Ala Ser 
435 440 445 

Gly Ala Gly Leu Pro Lys Val Leu Leu Glu Thr Glu Lys Pro Ala Arg 
450 455 460 

Tyr Arg Ala Arg lie Glu Asn Leu Pro Pro Ala Ala Pro Glu Ala Trp 
465 470 475 480 

Ser Arg Leu Pro Gly His Pro Ser lie Val Ala Leu Tyr Gly Gly Tyr 
485 490 495 

Phe Gly Lys Ala lie Gly Asp Lys Ser Asn Arg Arg Leu His Thr Gin 
500 505 510 

Tyr Ala Glu Pro Ala Ala Ala Pro 
515 520 

<210> 55 
<211> 240 
<212> PRT 

<213> Desulfitobacterium hafniense 
<400> 55 

Met Ala val Glu Lys Leu Thr Gly Glu val Leu Thr Asp Gin Leu Asp 
1 5 10 15 

Tyr Gin Glu Val Arg Gly Leu Gin Gly He Lys Glu Ala Ala Val Glu 
20 25 30 

Ala Lys Gly Lys Lys Val Asn Val Ala val lie Ser Gly Leu His Asn 
35 40 45 

val Glu Pro lie Leu Glu Lys lie lie Glu Gly Met Glu val Gly Tyr 
50 55 60 

Asp Leu lie Glu val Met Ala Cys Pro Gly Gly Cys lie Cys Gly Ala 
65 70 75 80 

Gly His Pro Val Pro Glu Lys lie Asp Thr Leu Glu Lys Arg Gin Gin 
85 90 95 

Val Leu val Asn lie Asp Gin Thr Ser Arg Tyr Arg Lys ser Gin Glu 
100 105 110 

Asn Pro Asp lie Leu Arg Leu Tyr Asp Glu Tyr Tyr Gly Glu Ala Asn 
115 120 125 

ser Pro Leu Ala His Lys Leu Leu His Thr His Tyr Glu Ala Val Lys 
130 135 140 

Arg Glu Pro Val Ala Lys His Asp Arg Arg Met Ala Asp Ser Ala Phe 
145 150 155 160 

val Thr His Glu Leu Thr Leu cys Thr Cys Asp Lys cys Thr Ala Gin 
165 170 175 
Gly Ser Arg Glu Leu Phe Ala Ala Leu Ser Gly Lys Ile Arg Lys Leu 
180 185 190 

Lys Met Asp ser Phe Val Thr Ala Arg Thr lie Arg Leu Lys Glu Asn 
195 200 205 

His Pro Gly Gin Gly Val Tyr Ala Ala lie Asp Gly Lys Leu lie Glu 
210 215 220 

Thr Pro val Glu Gin Leu Glu Gin Arg lie Phe Gin His Leu lie Arg 
225 230 235 240 

<210> 56 
<211> 86 
<212> PRT 

<213> Desulfitobacterium hafniense 
<400> 56 

Met Val Ser lie Val Pro Cys lie Ala Lys Lys Tyr Glu Ala Ala Arg 
1 5 10 15 

Pro Glu Phe Arg Ser Glu Gly lie Arg Asp Val Asp Ala Val Leu Thr 
20 25 30 

Ser Thr Glu Met Leu Glu Met Ala Asp lie Lys Leu lie Glu Pro Ala 
35 40 45 

Asp val Glu Pro Gin Asp Phe Cys Glu Pro Tyr Lys Arg val ser Gly 
50 55 60 

Ala Gly lie Leu Phe Gly Ala Ser Gly Gly val Ala Lys Arg Pro Cys 
65 70 75 80 

Gly Trp Arg Trp Arg Asn 
85 

<210> 57 
<211> 477 
<212> PRT 

<213> Drosophila melanogaster 
<400> 57 

Met Ser Arg Leu Ser Arg Ala Leu Gin Leu Thr Asp lie Asp Asp Phe 
1 5 10 15 

lie Thr pro ser Gin lie cys lie Lys Pro val Gin lie Asp Lys Ala 
20 25 30 

Arg Ser Lys Thr Gly Ala Lys lie Lys lie Lys Gly Asp Gly cys Phe 
35 40 45 

Glu Glu Ser Glu ser Gly Asn Leu Lys Leu Asn Lys Val Asp lie ser 
50 55 60 

Leu Gin Asp Cys Leu Ala cys Ser Gly Cys lie Thr ser Ala Glu Glu 
65 70 75 80 
Val Leu Ile Thr Gln Gln Ser Arg Glu Glu Leu Leu Lys Val Leu Gln 
85 90 95 

Glu Asn Ser Lys Asn Lys Ala Ser Glu Asp Trp Asp Asn Val Arg Thr 
100 105 110 

lie Val Phe Thr Leu Ala Thr Gin Pro lie Leu Ser Leu Ala Tyr Arg 
115 120 125 

Tyr Gin lie Gly Val Glu Asp Ala Ala Arg His Leu Asn Gly Tyr Phe 
130 135 140 

Arg ser Leu Gly Ala Asp Tyr val Leu ser Thr Lys Val Ala Asp Asp 
145 150 155 160 

lie Ala Leu Leu Glu Cys Arg Gin Glu Phe val Asp Arg Tyr Arg Glu 
165 170 175 

Asn Glu Asn Leu Thr Met Leu Ser Ser ser cys Pro Gly Trp val cys 
180 185 190 

Tyr Ala Glu Lys Thr His Gly Asn Phe Leu Leu Pro Tyr val Ser Thr 
195 200 205 

Thr Arg ser Pro Gin Gin lie Met Gly Val Leu Val Lys Gin lie Leu 
210 215 220 

Ala Asp Lys Met Asn val Pro Ala Ser Arg lie Tyr His Val Thr Val 
225 230 235 240 

Met Pro Cys Tyr Asp Lys Lys Leu Glu Ala Ser Arg Glu Asp Phe Phe 
245 250 255 

Ser Lys Ala Asn Asn Ser Arg Asp Val Asp Cys Val lie Thr ser val 
260 265 270 

Glu Val Glu Gin Leu Leu Ser Glu Ala Gin Gin Pro Leu ser Gin Tyr 

275 280 285 

Asp Leu Leu Asp Leu Asp Trp Pro Trp Ser Asn Val Arg Pro Glu Phe 
290 295 300 

Met Val Trp Ala His Glu Lys Thr Leu ser Gly Gly Tyr Ala Glu His 
305 310 315 320 

lie Phe Lys Tyr Ala Ala Lys His lie Phe Asn Glu Asp Leu Lys Thr 
325 330 335 

Glu Leu Glu Phe Lys Gin Leu Lys Asn Arg Asp Phe Arg Glu lie lie 
340 345 350 

Leu Lys Gin Asn Gly Lys Thr Val Leu Lys Phe Ala lie Ala Asn Gly 
355 360 365 

Phe Arg Asn lie Gin Asn Leu Val Gin Lys Leu Lys Arg Glu Lys Val 
370 375 380 
Ser Asn Tyr His Phe Val Glu Val Met Ala Cys Pro Ser Gly Cys Ile 
385 390 395 400 

Asn Gly Gly Ala Gin lie Arg Pro Thr Thr Gly Gin His Val Arg Glu 
405 410 415 

Leu Thr Arg Lys Leu Glu Glu Leu Tyr Gin Asn Leu Pro Arg Ser Glu 
420 425 430 

Pro Glu Asn Ser Leu Thr Lys His lie Tyr Asn Asp Phe Leu Asp Gly 
435 440 445 

Phe Gin Ser Asp Lys Ser Tyr Asp Val Leu His Thr Arg Tyr His Asp 
450 455 460 

val val ser Glu Leu ser lie ser Leu Asn lie Asn Trp 
465 470 475 

<210> 58 

<211> 538 

<212> PRT 

<213> S. pombe 

<400> 58 

Met Ala Lys Leu ser val Asn Asp Leu Asn Asp Phe Leu ser Pro Gly 
1 5 10 15 

Ala val cys lie Lys Pro Ala Gin Val Lys Lys Gin Glu ser Lys Asn 
20 25 30 

Asp lie Arg lie Asp Gly Asp Ala Tyr Tyr Glu val Thr Lys Asp Thr 
35 40 45 

Gly Glu Thr Ser Glu Leu Gly lie Ala Ser lie Ser Leu Asn Asp Cys 
50 55 60 

Leu Ala cys Ser Gly cys lie Thr ser Ala Glu Thr val Leu val Asn 
65 70 75 80 

Leu Gin ser Tyr Gin Glu val Leu Lys His Leu Glu Ser Arg Lys Ser 
85 90 95 

Gin Glu lie Leu Tyr Val Ser Leu ser Pro Gin val Arg Ala Asn Leu 
100 105 110 

Ala Ala Tyr Tyr Gly Leu Ser Leu Gin Glu lie Gin Ala Val Leu Glu 
115 120 125 

Met val Phe lie Gly Lys Leu Gly Phe His Ala lie Leu Asp Thr Asn 
130 135 140 

Ala Ser Arg Glu lie Val Leu Gin Gin Cys Ala Gin Glu Phe cys Asn 
145 150 155 160 

ser Trp Leu Gin Ser Arg Ala His Lys Asn Gin Asn Gin val Thr Asn 
165 170 175 
Ser Val Val Asn Glu His Pro Leu Ile Pro His Ser Thr Ser Gln Ile 
180 185 190 

Ser Gly val His ser Asn Thr ser ser Asn Ser Gly lie Asn Glu Asn 
195 200 205 

Ala val Leu Pro lie Leu ser ser Ser Cys Pro Gly Trp lie Cys Tyr 
210 215 220 

Val Glu Lys Thr His Ser Asn Leu lie Pro Asn Leu Ser Arg Val Arg 
225 230 235 240 

ser Pro Gin Gin Ala Cys Gly Arg lie Leu Lys Asp Trp Ala Val Gin 
245 250 255 

Gin Phe ser Met Gin Arg Asn Asp Val Trp His Leu Ser Leu Met Pro 
260 265 270 

Cys Phe Asp Lys Lys Leu Glu Ala ser Arg Asp Glu Phe Ser Glu Asn 
275 280 285 

Gly val Arg Asp Val Asp Ser Val Leu Thr pro Lys Glu Leu val Glu 
290 295 300 

Met Phe Lys Phe Leu Arg lie Asp Pro lie Glu Leu Thr Lys Asn Pro 
305 310 315 320 

lie Pro Phe Gin Gin Ser Thr Asp Ala lie Pro Phe Trp Tyr Pro Arg 
325 330 335 

lie Thr Tyr Glu Glu Gin lie Gly Ser Ser Ser Gly Gly Tyr Met Gly 
340 345 350 

Tyr Val Leu ser Tyr Ala Ala Lys Met Leu Phe Gly lie Asp Asp Val 
355 360 365 

Gly Pro Tyr val ser Met Asn Asn Lys Asn Gly Asp Leu Thr Glu Tyr 
370 375 380 

Thr Leu Arg His Pro Glu Thr Asn Glu Gin Leu lie ser Met Ala Thr 
385 390 395 400 

Cys Tyr Gly Phe Arg Asn lie Gin Asn Leu Val Arg Arg Val His Gly 
405 410 415 

Asn Ser ser val Arg Lys Gly Arg Val Leu Leu Lys Lys Arg val Arg 
420 425 430 

ser Asn Ala Gin Asn Pro Thr Glu Glu Pro Ser Arg Tyr Asp Tyr Val 
435 440 445 

Glu val Met Ala cys Pro Gly Gly cys lie Asn Gly Gly Gly Gin Leu 
450 455 460 

Pro Phe Pro Ser Val Glu Arg lie Val Ser Ala Arg Asp Trp Met Gin 
465 470 475 480 

Gln Val Glu Lys Leu Tyr Tyr Glu Pro Gly Thr Arg Ser Val Asp Gln 
485 490 495 

ser Ala val Ser Tyr Met Leu Glu Gin Trp Val Lys Asp Pro Thr Leu 
500 505 510 

Thr Pro Lys Phe Leu His Thr Ser Tyr Arg Ala val Gin Thr Asp Asn 
515 520 525 

Asp Asn Pro Leu Leu Leu Ala Asn Lys Trp 
530 535 

<210> 59 

<211> 119 

<212> PRT 

<213> Metopus contortus 

<400> 59 

lie lie Phe Ala Glu Lys Asn Tyr Pro Glu Met Val Asn His Leu Ser 
1 5 10 15 

Thr Thr Lys Ser Pro Met Gin Met Leu Ser Ser Leu ser Lys Gly Tyr 
20 25 30 

Trp Ala Lys Glu Gly Lys Lys lie Asp Pro Lys Asn val val Asn Val 
35 40 45 

Ala lie Met Pro Cys Thr Ala Lys Lys Ala Trp Lys Glu Arg Pro Asp 
50 55 60 

Met Lys Ala Asp Asn Gly Asp Pro Val Thr Asp Tyr Val Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Gly Thr Met Leu Arg Gin Ser Asn lie Asn Pro val Ser 
85 90 95 

Leu Pro Lys Thr Pro Phe Asp Lys lie Met Gly Glu Ser Thr Gly Ala 
100 105 110 

Ala Val lie Phe Gly Ala Thr 
115 

<210> 60 

<211> 462 

<212> PRT 

<213> Mus musculus 

<400> 60 

Met Lys Cys Glu His Cys Thr Arg Lys Glu Cys Ser Lys Lys Ser Lys 
1 5 10 15 

Asn Asp Asp Gin Glu Asn Val Ser Ser Asp Gly Ala Gin Pro Ser Asp 
20 25 30 

Gly Ala Ser Pro Ala Lys Glu ser Glu Glu Lys Gly Glu Phe His Lys 
35 40 45 
Leu Ala Asp Ala Lys Ile Phe Leu Ser Asp Cys Leu Ala Cys Asp Ser 
50 55 60 

cys val Thr Val Glu Glu Gly Val Gin Leu ser Gin Gin ser Ala Lys 
65 70 75 80 

Asp Phe Leu His val Leu Asn Leu Asn Lys Arg cys Asp Thr ser Lys 
85 90 95 

His Arg Val Leu Val Val Ser Val Cys Pro Gin ser Leu Pro Tyr Phe 
100 105 110 

Ala Ala Lys Phe Asn Leu Ser Val Thr Asp Ala Ser Arg Arg Leu Cys 
115 120 125 

Gly Phe Leu Lys Ser Leu Gly Val His Tyr Val Phe Asp Thr Thr lie 
130 135 140 

Ala Ala Asp Phe ser lie Leu Glu Ser Gin Lys Glu Phe val Arg Arg 
145 150 155 160 

Tyr His Gin His Ser Glu Glu Gin Arg Glu Leu Pro Met Leu Thr ser 
165 170 175 

Ala Cys Pro Gly Trp val Arg Tyr Ala Glu Arg Val Leu Gly Arg Pro 
180 185 190 

lie lie Pro Tyr Leu Cys Thr Ala Lys Ser Pro Gin Gin val Met Gly 
195 200 205 

Ser Leu Val Lys Asp Tyr Phe Ala Arg Gin Gin Asn Leu Ser Pro Glu 
210 215 220 

Lys lie Phe His Val Val Val Ala Pro Cys Tyr Asp Lys Lys Leu Glu 
225 230 235 240 

Ala Leu Arg Glu Gly Leu Ser Thr Thr Leu Asn Gly Ala Arg Gly Thr 
245 250 255 

Asp Cys val Leu Thr Ser Gly Glu lie Ala Gin lie Met Glu Gin Ser 
260 265 270 

Asp Leu Ser Val Lys Asp lie Ala Val Asp Thr Leu Phe Gly Asp Met 
275 280 285 

Lys Glu Val Ala Val Gin Arg His Asp Gly Val Ser Ser Asp Gly His 
290 295 300 

Leu Ala His Val Phe Arg His Ala Ala Lys Glu Leu Phe Gly Glu His 
305 310 315 320 

Val Glu Glu lie Thr Tyr Arg Ala Leu Arg Asn Lys Asp Phe His Glu 
325 330 335 

val Thr Leu Glu Lys Asn Gly Glu val Leu Leu Arg Phe Ala Ala Ala 
340 345 350 

Tyr Gly Phe Arg Asn lie Gin Asn Met lie Gin Lys Leu Lys Lys Gly 
355 360 365 

Lys Leu Pro Tyr His Phe Val Glu val Leu Ala cys Pro Arg Gly cys 
370 375 380 

Leu Asn Gly Arg Gly Gin Ala Gin Thr Glu Asp Gly His Thr Asp Arg 
385 390 395 400 

Ala Leu Leu Gin Gin Met Glu Gly lie Tyr Ser Gly lie Pro val Arg 
405 410 415 

Pro Pro Glu ser ser Thr His val Gin Glu Leu Tyr Gin Glu Trp Leu 
420 425 430 

Glu Gly Thr Glu ser Pro Lys Val Gin Glu Val Leu His Thr ser Tyr 
435 440 445 

Gin Ser Leu Glu Pro Cys Thr Asp Gly Leu Asp lie Lys Trp 
450 455 460 

<210> 61 

<211> 457 

<212> PRT 

<213> Caenorhabditis elegans 

<400> 61 

Met Glu Asp ser Gly Phe ser Gly val val Arg Leu ser Asn val Ser 
1 5 10 15 

Asp Phe lie Ala Pro Asn Leu Asp Cys lie lie Pro Leu Glu Thr Arg 
20 25 30 

Thr Val Glu Lys Lys Lys Glu Glu Ser Gin val Asn lie Arg Thr Lys 
35 40 45 

Lys Pro Lys Asp Lys Glu Ser Ser Lys Thr Glu Glu Lys Lys ser val 
50 55 60 

Lys lie ser Leu Ala Asp Cys Leu Ala cys ser Gly Cys lie Thr ser 
65 70 75 80 

Ala Glu Thr val Leu Val Glu Glu Gin ser Phe Gly Arg val Tyr Glu 
85 90 95 

Gly lie Gin Asn Ser Lys Leu Ser val val Thr val Ser Pro Gin Ala 
100 105 110 

lie Thr Ser lie Ala Val Lys lie Gly Lys Ser Thr Asn Glu val Ala 
115 120 125 

Lys lie lie Ala Ser Phe Phe Arg Arg Leu Gly val Lys Tyr Val lie 
130 135 140 

Asp Ser Ser Phe Ala Arg Lys Phe Ala His Ser Leu Ile Tyr Glu Glu
145 150 155 160

Leu Ser Thr Thr Pro Ser Thr Ser Arg Pro Leu Leu Ser Ser Ala Cys 
165 170 175 

Pro Gly Phe val cys Tyr Ala Glu Lys ser His Gly Glu Leu Leu lie 
180 185 190 

Pro Lys lie ser Lys lie Arg Ser Pro Gin Ala lie Ser Gly Ala lie 
195 200 205 

lie Lys Gly Phe Leu Ala Lys Arg Glu Gly Leu Ser Pro cys Asp val 
210 215 220

Phe His Ala Ala val Met Pro Cys Phe Asp Lys Lys Leu Glu Ala ser 
225 230 235 240 

Arg Glu Gin Phe Lys Val Asp Gly Thr Asp val Arg Glu Thr Asp cys 
245 250 255

val lie ser Thr Ala Glu Leu Leu Glu Glu lie lie Lys Leu Glu Asn 
260 265 270 

Asp Glu Ala Gly Asp Val Glu Asn Arg ser Glu Glu Glu Gin Trp Leu 
275 280 285 

Ser Ala Leu ser Lys Gly ser Val lie Gly Asp Asp Gly Gly Ala ser 
290 295 300 

Gly Gly Tyr Ala Asp Arg lie Val Arg Asp Phe val Leu Glu Asn Gly 
305 310 315 320 

Gly Gly lie Val Lys Thr Ser Lys Leu Asn Lys Asn Met Phe ser Thr 
325 330 335 

Thr val Glu ser Glu Ala Gly Glu lie Leu Leu Arg Val Ala Lys Val 
340 345 350 

Tyr Gly Phe Arg Asn val Gin Asn Leu Val Arg Lys Met Lys Thr Lys 
355 360 365 

Lys Glu Lys Thr Asp Tyr Val Glu lie Met Ala Cys Pro Gly Gly Cys 
370 375 380 

Ala Asn Gly Gly Gly Gin lie Arg Tyr Glu Thr Met Asp Glu Arg Glu 
385 390 395 400 

Glu Lys Leu lie Lys val Glu Ala Leu Tyr Glu Asp Leu Pro Arg Gin 
405 410 415 

Asp Asp Glu Glu Thr Trp lie Lys Val Arg Glu Glu Trp Glu Lys Leu 
420 425 430 

Asp Lys Asn Tyr Arg Asn Leu Leu Phe Thr Asp Tyr Arg Pro val Glu 
435 440 445 
<210> 62

<211> 462 

<212> PRT 

<213> Mus musculus 

<400> 62 

Met Lys Cys Glu His Cys Thr Arg Lys Glu Cys Ser Lys Lys Ser Lys
1 5 10 15

Thr Asp Asp Gln Glu Asn Val Ser Ser Asp Gly Ala Gln Pro Ser Asp
20 25 30

Gly Ala Ser Pro Ala Lys Glu Ser Glu Glu Lys Gly Glu Phe His
35 40 45

Leu Ala Asp Ala Lys Ile Phe Leu Ser Asp Cys Leu Ala Cys Asp Ser
50 55 60

Cys Val Thr Val Glu Glu Gly Val Gln Leu Ser Gln Gln Ser Ala Lys
65 70 75 80

Asp Phe Leu His Val Leu Asn Leu Asn Lys Arg Cys Asp Thr Ser Lys
85 90 95

His Arg Val Leu Val Val Ser Val Cys Pro Gln Ser Leu Pro Tyr Phe
100 105 110

Ala Ala Lys Phe Asn Leu Ser Val Thr Asp Ala Ser Arg Arg Leu Cys
115 120 125

Gly Phe Leu Lys Ser Leu Gly Val His Tyr Val Phe Asp Thr Thr Ile
130 135 140

Ala Ala Asp Phe Ser Ile Leu Glu Ser Gln Lys Glu Phe Val Arg Arg
145 150 155 160

Tyr His Gln His Ser Glu Glu Gln Arg Glu Leu Pro Met Leu Thr Ser
165 170 175

Ala Cys Pro Gly Trp Val Arg Tyr Ala Glu Arg Val Leu Gly Arg Pro
180 185 190

Ile Ile Pro Tyr Leu Cys Thr Ala Lys Ser Pro Gln Gln Val Met Gly
195 200 205

Ser Leu Val Lys Asp Tyr Phe Ala Arg Gln Gln Asn Leu Ser Pro Glu
210 215 220

Lys Ile Phe His Val Val Val Ala Pro Cys Tyr Asp Lys Lys Leu Glu
225 230 235 240

Ala Leu Arg Glu Gly Leu Ser Thr Thr Leu Asn Gly Ala Arg Gly Thr
245 250 255

Asp Cys Val Leu Thr Ser Gly Glu Ile Ala Gln Ile Met Glu Gln Ser
260 265 270 

Asp Leu Ser Val Lys Asp lie Ala Val Asp Thr Leu Phe Gly Asp Met 
275 280 285 

Lys Glu Val Ala val Gin Arg His Asp Gly Val ser ser Asp Gly His 
290 295 300 

Leu Ala His Val Phe Arg His Ala Ala Lys Glu Leu Phe Gly Glu His 
305 310 315 320 

val Glu Glu lie Thr Tyr Arg Ala Leu Arg Asn Lys Asp Phe His Glu 
325 330 335 

val Thr Leu Glu Lys Asn Gly Glu val Leu Leu Arg Phe Ala Ala Ala 
340 345 350 

Tyr Gly Phe Arg Asn lie Gin Asn Met lie Gin Lys Leu Lys Lys Gly 
355 360 365 

Lys Leu Pro Tyr His Phe Val Glu Val Leu Ala Cys Pro Arg Gly Cys 
370 375 380 

Leu Asn Gly Arg Gly Gin Ala Gin Thr Glu Asp Gly His Thr Asp Arg 
385 390 395 400 

Ala Leu Leu Gin Gin Met Glu Gly lie Tyr Ser Gly lie Pro val Arg 
405 410 415

Pro Pro Glu ser ser Thr His Val Gin Glu Leu Tyr Gin Glu Trp Leu 
420 425 430 

Glu Gly Thr Glu Ser Pro Lys val Gin Glu Val Leu His Thr Ser Tyr 
435 440 445 

Gin Ser Leu Glu Pro Cys Thr Asp Gly Leu Asp lie Lys Trp 
450 455 460 

<210> 63 
<211> 119 
<212> PRT 

<213> Neocallimastix
<400> 63 

lie Met Phe Ala Glu Lys Asn Phe Pro Asp Met val Asn Asn Leu Ser 
1 5 10 15

Thr Thr Lys Ser Pro Met Gin Met Leu ser Ser Leu Thr Lys Gly Tyr 
20 25 30 

Trp Ala Lys Asp lie Lys Lys lie Asn Pro Lys Asp val Val Asn Val 
35 40 45 

Ala lie Met Pro Cys Thr Ala Lys Lys Gin Glu Lys Asp Arg Pro Gly 
50 55 60 
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P Sit/LylSrhl "ity 3sp Ly^Val 1 Th^A^Phe^vaV Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Gly Met Met Leu Arg Gin Ala Asn lie Asp Pro Thr Lys 
85 90 95

Leu Pro Gly Thr Lys Phe Asp Lys val Met Gly Glu ser Thr Gly Ala 
100 105 110

Ala Val lie Phe Gly Ala Thr 
115 

<210> 64 

<211> 119 

<212> PRT 

<213> Nyctotherus ovalis

<400> 64 

lie lie Phe Met Glu Lys Asn Tyr Pro Asp Met Leu Ser His Leu Ser 
1 5 10 15 

Thr Cys Lys Ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly Tyr 
20 25 30 

Trp Ala Lys Lys val Lys Lys val Asp Pro Lys Asp val val Ser val 
35 40 45 

ser lie Met Pro cys Thr Ala Lys Lys Ala Glu Lys Glu Arg Pro Gin 
50 55 60 

Leu Arg Gly Asp Glu Gly Phe Lys Asp val Asp Tyr val Leu Thr Thr 
65 70 75 80 

Arg Glu Leu Ala Lys Met Leu Lys Gin Ser Asn lie Asp Leu Gly Lys 
85 90 95 

val Glu Pro Thr Pro Phe Asp Ala val Met ser Glu Gly Thr Gly Ala 
100 105 110 

Ala val lie Phe Gly val Thr 
115 

<210> 65 

<211> 490 

<212> PRT 

<213> Clostridium perfringens 

<400> 65 

Met Ala lie Lys Asp Ala Asn Lys Gin Tyr lie Lys Phe Asp Thr Ala 
1 5 10 15

Val Gin Val Leu Lys Tyr Glu Val Leu Lys Arg He Ala Glu Lys Glu 
20 25 30 

Phe Asp Gly Thr Leu Asp Lys Glu Lys Leu Asn lie Ala Lys Glu lie 
35 40 45 

Val Asp Asp Leu Lys Pro Asn Val Arg cys cys lie Tyr Lys Glu Arg 
50 55 60

Ala Ile Val Glu Glu Arg Met Lys Leu Ala Leu Gly Gly His Glu Asn
65 70 75 80 

Arg Glu Asn Met He Glu Val lie Asp lie Ala Cys Asp Glu Cys Pro 
85 90 95 

Val Asn Arg Phe lie Val Thr Asp Ala Cys Arg Gly cys Leu Ala Lys 
100 105 110 

Lys cys Arg Asp ser cys Asn Phe Gly Ala lie Ser Phe Asp Asn Arg 
115 120 125 

Lys cys Lys lie Asp Tyr Glu Lys Cys Lys Glu Cys Gly Lys Cys Lys 
130 135 140 

Glu Val Cys Pro Tyr Asn Ala lie Ala Glu val Lys Arg Pro Cys Met 
145 150 155 160 

Arg Ala cys lie Pro Lys Ala Leu ser Tyr Asp val Asp ser Lys Lys 
165 170 175 

Ala Val lie Asp Asp Ser Lys Cys lie Gin Cys Gly Ala cys Val Val 
180 185 190 

Asp cys Pro Phe Gly Ala lie Met Asp Lys Ser Tyr Leu Val Asp Val 
195 200 205 

lie Arg Leu Leu Lys Asp Glu Lys Lys Val Tyr Ala lie val Ala Pro 
210 215 220 

Ala lie Ser Ser Gin Phe Asn His ser Lys lie Gly Lys Val lie Thr 
225 230 235 240 

Ala lie Lys Lys Leu Gly Phe Glu Asp val Phe Glu Ala Ala Leu Gly 
245 250 255 

Ala Asp Leu val Ala Val His Glu Cys Asn Glu Phe Lys Glu Lys Gly 
260 265 270 

Glu Leu Asp Phe Met Thr Thr Ser cys cys Pro Ala Phe val ser Tyr 
275 280 285 

lie Glu Lys Asn Tyr Pro Glu Leu Lys Glu Cys lie Ser Asn Thr val 
290 295 300 

ser Pro Met val Ala Met Ala Arg Leu lie Lys ser Gin Asn Lys Asp 
305 310 315 320 

Val Lys Thr Val Phe lie Gly Pro Cys lie Ala Lys Lys Thr Glu Ala 
325 330 335 

Lys Arg Asn Glu val Ser Gly Asp val Asp Tyr val Leu Thr Phe Glu 
340 345 350 
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,IM Sell ?ila 5e2t 'iei 3sp° Ser" Arg 1 As S n e Ti e e nC Lys L iVe Asp Gl u cys 
355 360 365 

Glu Glu ser Asp Thr Lys His Gly ser Phe Tyr Gly Arg Leu Phe Ala 
370 375 380 

Arg ser Gly Gly Val Thr Glu ser Val Lys His Leu He Asp Ser Glu 
385 390 395 400 

Gly lie Lys Val Asp Phe Arg Pro lie Leu Gly Asp Gly lie Lys Asp 
405 410 415 

Cys Asp lie Lys Leu Arg Leu Ala Lys Leu Lys Arg Ala Gin Gly Asn 
420 425 430 

Phe Leu Glu Gly Met Ala cys Lys Gly Gly cys lie Asn Gly Pro Gly 
435 440 445 

ser Leu Asn His Asp lie Lys Asn ser Lys Glu val Asp Lys Tyr Gly 
450 455 460 

Glu Leu ser Ser ser Glu Lys lie Lys Asp Thr Leu Ala Asp lie Lys 
465 470 475 480 

Phe Glu Asp Leu Asn Leu ser Lys Asn Glu 
485 490 

<210> 66 

<211> 456 

<212> PRT 

<213> Homo sapiens 

<400> 66 

Met Lys cys Glu His cys Thr Arg Lys Glu Cys ser Lys Lys Thr Lys 
1 5 10 15

Thr Asp Asp Gin Glu Asn val ser Ala Asp Ala Pro ser Pro Ala Gin 
20 25 30 

Glu Asn Gly Glu Lys Gly Glu Phe His Lys Leu Ala Asp Ala Lys lie 
35 40 45 

Phe Leu Ser Asp Cys Leu Ala Cys Asp ser Cys Met Thr Ala Glu Glu 
50 55 60 

Gly Val Gin Leu Ser Gin Gin Asn Ala Lys Asp Phe Phe Arg Val Leu 
65 70 75 80 

Asn Leu Asn Lys Lys Cys Asp Thr ser Lys His Lys Val Leu Val val 
85 90 95 

Ser Val cys Pro Gin ser Leu Pro Tyr Phe Ala Ala Lys Phe Asn Leu 
100 105 110 

Ser Val Thr Asp Ala Ser Arg Arg Leu cys Gly Phe Leu Lys Ser Leu 
115 120 125 
Gly Val His Tyr Val Phe Asp Thr Thr Ile Ala Ala Asp Phe Ser Ile
130 135 140 



Leu Glu ser Gin Lys Glu Phe Val Arg Arg Tyr Arg Gin His Ser Glu 
145 150 155 160

Glu Glu Arg Thr Leu Pro Met Leu Thr ser Ala cys Pro Gly Trp Val 
165 170 175 

Arg Tyr Ala Glu Arg val Leu Gly Arg Pro lie Thr Ala His Leu Cys 
180 185 190

Thr Ala Lys Ser Pro Gin Gin Val Met Gly ser Leu val Lys Asp Tyr 
195 200 205 

Phe Ala Arg Gin Gin Asn Leu Ser Pro Glu Lys lie Phe His val lie 
210 215 220 

Val Ala Pro Cys Tyr Asp Lys Lys Leu Glu Ala Leu Gin Glu Ser Leu 
225 230 235 240 

Pro Pro Ala Leu His Gly Ser Arg Gly Ala Asp Cys val Leu Thr Ser 
245 250 255 

Gly Glu lie Ala Gin lie Met Glu Gin Gly Asp Leu ser val Arg Asp 
260 265 270

Ala Ala val Asp Thr Leu Phe Gly Asp Leu Lys Glu Asp Lys val Thr 
275 280 285 

Arg His Asp Gly Ala Ser Ser Asp Gly His Leu Ala His lie Phe Arg 
290 295 300 

His Ala Ala Lys Glu Leu Phe Asn Glu Asp val Glu Glu val Thr Tyr 
305 310 315 320 

Arg Ala Leu Arg Asn Lys Asp Phe Gin Glu Val Thr Leu Glu Lys Asn 
325 330 335 

Gly Glu val Val Leu Arg Phe Ala Ala Ala Tyr Gly Phe Arg Asn lie 
340 345 350 

Gin Asn Met lie Leu Lys Leu Lys Lys Gly Lys Phe Pro Phe His Phe 
355 360 365 

Val Glu val Leu Ala cys Ala Gly Gly cys Leu Asn Gly Arg Gly Gin 
370 375 380 

Ala Gin Thr Pro Asp Gly His Ala Asp Lys Ala Leu Leu Arg Gin Met 
385 390 395 400 

Glu Gly lie Tyr Ala Asp lie Pro Val Arg Arg Pro Glu Ser Ser Ala 
405 410 415 

His Val Gin Glu Leu Tyr Gin Glu Trp Leu Glu Gly lie Asn Ser Pro 
420 425 430 
Lys Ala Arg Glu Val Leu His Thr Thr Tyr Gln Ser Gln Glu Arg Gly
435 440 445 

Thr His ser Leu Asp lie Lys Trp 
450 455 

<210> 67 

<211> 408 

<212> PRT 

<213> Homo sapiens 

<400> 67 

Met Lys cys Glu His Cys Thr Arg Lys Glu cys ser Lys Lys Thr Lys 
1 5 10 15 

Thr Asp Asp Gin Glu Asn Val ser Ala Asp Ala Pro ser Pro Ala Gin 
20 25 30 

Glu Asn Gly Glu Lys cys Asp Thr ser Lys His Lys Val Leu Val Val 
35 40 45 

Ser val cys Pro Gin ser Leu Pro Tyr Phe Ala Ala Lys Phe Asn Leu 
50 55 60 

Ser val Thr Asp Ala Ser Arg Arg Leu Cys Gly Phe Leu Lys ser Leu 
65 70 75 80 

Gly val His Tyr Val Phe Asp Thr Thr lie Ala Ala Asp Phe Ser lie 
85 90 95 

Leu Glu Ser Gin Lys Glu Phe val Arg Arg Tyr Arg Gin His Ser Glu 
100 105 110 

Glu Glu Arg Thr Leu Pro Met Leu Thr Ser Ala Cys Pro Gly Trp Val 
115 120 125 

Arg Tyr Ala Glu Arg Val Leu Gly Arg Pro lie Thr Ala His Leu Cys 
130 135 140 

Thr Ala Lys ser Pro Gin Gin Val Met Gly Ser Leu val Lys Asp Tyr 
145 150 155 160 

Phe Ala Arg Gin Gin Asn Leu Ser Pro Glu Lys lie Phe His val lie 
165 170 175 

Val Ala Pro cys Tyr Asp Lys Lys Leu Glu Ala Leu Gin Glu Ser Leu 
180 185 190 

Pro Pro Ala Leu His Gly ser Arg Gly Ala Asp Cys Val Leu Thr Ser 
195 200 205 

Gly Glu lie Ala Gin lie Met Glu Gin Gly Asp Leu ser Val Arg Asp 
210 215 220 

Ala Ala Val Asp Thr Leu Phe Gly Asp Leu Lys Glu Asp Lys val Thr 
225 230 235 240 

Arg His Asp Gly Ala Ser Ser Asp Gly His Leu Ala His lie Phe Arg 
245 250 255 

His Ala Ala Lys Glu Leu Phe Asn Glu Asp Val Glu Glu Val Thr Tyr 
260 265 270 

Arg Ala Leu Arg Asn Lys Asp Phe Gin Glu val Thr Leu Glu Lys Asn 
275 280 285 

Gly Glu val val Leu Arg Phe Ala Ala Ala Tyr Gly Phe Arg Asn lie 
290 295 300 

Gin Asn Met lie Leu Lys Leu Lys Lys Gly Lys Phe Pro Phe His Phe 
305 310 315 320 

val Glu val Leu Ala cys Ala Gly Gly cys Leu Asn Gly Arg Gly Gin 
325 330 335 

Ala Gin Thr Pro Asp Gly His Ala Asp Lys Ala Leu Leu Arg Gin Met 
340 345 350 

Glu Gly lie Tyr Ala Asp lie Pro Val Arg Arg Pro Glu ser ser Ala 
355 360 365 

His val Gin Glu Leu Tyr Gin Glu Trp Leu Glu Gly lie Asn ser Pro 
370 375 380 

Lys Ala Arg Glu Val Leu His Thr Thr Tyr Gin Ser Gin Glu Arg Gly 
385 390 395 400 

Thr His Ser Leu Asp lie Lys Trp 
405 

<210> 68 

<211> 502 

<212> PRT 

<213> Homo sapiens 

<400> 68 

Met Lys Cys Glu His Cys Thr Arg Lys Glu Cys Ser Lys Lys Thr Lys 
1 5 10 15 

Thr Asp Asp Gin Glu Asn Val Ser Ala Asp Ala Pro ser Pro Ala Gin 
20 25 30 

Glu Asn Gly Glu Lys Gly Glu Phe His Lys Leu Ala Asp Ala Lys lie 
35 40 45 

Phe Leu ser Asp Cys Leu Ala cys Asp ser cys Met Thr Ala Glu Glu 
50 55 60 

Gly Val Gin Leu ser Gin Gin Asn Ala Lys Asp Phe Phe Arg Val Leu 
65 70 75 80 

Asn Leu Asn Lys Lys Cys Asp Thr ser Lys His Lys Val Leu val Val 
85 90 95 
Ser Val Cys Pro Gln Ser Leu Pro Tyr Phe Ala Ala Lys Phe Asn Leu
100 105 110 

Ser val Thr Asp Ala Ser Arg Arg Leu Cys Gly Phe Leu Lys ser Leu 
115 120 125 

Gly val His Tyr Val Phe Asp Thr Thr lie Ala Ala Asp Phe ser lie 
130 135 140 

Leu Glu Ser Gin Lys Glu Phe val Arg Arg Tyr Arg Gin His Ser Glu 
145 150 155 160

Glu Glu Arg Thr Leu Pro Met Leu Thr ser Ala Cys Pro Gly Trp Val 
165 170 175 

Arg Tyr Ala Glu Arg val Leu Gly Arg Pro lie Thr Ala His Leu Cys 
180 185 190 

Thr Ala Lys Ser Pro Gin Gin val Met Gly Ser Leu Val Lys Asp Tyr 
195 200 205 

Phe Ala Arg Gin Gin Asn Leu Ser Pro Glu Lys lie Phe His val lie 
210 215 220 

val Ala Pro Cys Tyr Asp Lys Lys Leu Glu Ala Leu Gin Glu ser Leu 
225 230 235 240 

Pro Pro Ala Leu His Gly ser Arg Gly Ala Asp cys Val Leu Thr Ser 
245 250 255 

Glu Ile Ser Gln Ala Trp Trp Cys Thr Pro Val Ile Thr Ala Thr Arg
260 265 270 

Glu Ala Ala Ala Arg Glu ser Leu Glu Pro Gly Arg Gin Arg Leu Gin 
275 280 285 

Arg Asp Lys He Ala Pro Leu Asp Ser Ser Leu Gly Gly Gly Gly Glu 
290 295 300 

lie Ala Gin lie Met Glu Gin Gly Asp Leu Ser Val Arg Asp Ala Ala 
305 310 315 320

Val Asp Thr Leu Phe Gly Asp Leu Lys Glu Asp Lys Val Thr Arg His
325 330 335 

Asp Gly Ala ser ser Asp Gly His Leu Ala His lie Phe Arg His Ala 
340 345 350 

Ala Lys Glu Leu Phe Asn Glu Asp Val Glu Glu Val Thr Tyr Arg Ala
355 360 365 

Leu Arg Asn Lys Asp Phe Gin Glu Val Thr Leu Glu Lys Asn Gly Glu 
370 375 380 

val Val Leu Arg Phe Ala Ala Ala Tyr Gly Phe Arg Asn lie Gin Asn 
385 390 395 400 
Met Ile Leu Lys Leu Lys Lys Gly Lys Phe Pro Phe His Phe Val Glu
405 410 415 

Val Leu Ala Cys Ala Gly Gly cys Leu Asn Gly Arg Gly Gin Ala Gin 
420 425 430 

Thr Pro Asp Gly His Ala Asp Lys Ala Leu Leu Arg Gin Met Glu Gly 
435 440 445 

lie Tyr Ala Asp lie Pro Val Arg Arg Pro Glu ser Ser Ala His Val 
450 455 460

Gin Glu Leu Tyr Gin Glu Trp Leu Glu Gly lie Asn ser Pro Lys Ala 
465 470 475 480 

Arg Glu val Leu His Thr Thr Tyr Gin ser Gin Glu Arg Gly Thr His 
485 490 495 

Ser Leu Asp lie Lys Trp 
500 

<210> 69 
<211> 448 
<212> PRT 

<213> Clostridium tetani 
<400> 69 

Met His Asn Asp Tyr Arg Glu lie Phe Lys Arg Leu Ser Lys Ser Tyr 
1 5 10 15 

Tyr Asp Asp Thr Phe Glu Lys Glu val Glu Asn lie Leu Ser ser His 
20 25 30 

Ser Met Asp Arg Glu Lys Leu Ala Lys lie lie ser lie Leu Cys Gly 
35 40 45 

Val Asn lie Glu His ser Glu Asn Tyr lie ser Asn Leu Lys Asn Ala 
50 55 60 

lie Lys Asn Tyr Thr Ala Ser Ala Glu Lys Val val Thr Lys Leu Pro 
65 70 75 80 

Cys Ser Thr Gin Cys Ala Lys Asp Gly Asp lie lie Cys Glu Lys Ser 
85 90 95 

cys Pro val Asn Ala lie Phe Arg Asp Pro Asn Asp Asn Asn lie Tyr 
100 105 110 

lie Asn Asp Glu Leu cys Leu Asp cys Gly Leu Cys Val Arg Asn Cys 
115 120 125 

Pro ser Gly ser lie Leu Asp Lys Lys Glu Phe lie Pro Leu Ala Glu 
130 135 140 

Leu Leu Lys Ser Glu Ser lie Val lie Ala Ala Val Ala Pro Ala lie 
145 150 155 160 
Met Gly Gln Phe Gly Glu Asn Thr Thr Ile Asn Gln Leu Arg Thr Ala
165 170 175 

Phe Lys Lys Leu Gly Phe Thr Asp Met Val Glu Val Ala Phe Phe Ala 
180 185 190 

Asp Met Leu Thr Leu Lys Glu Ala Val Glu Tyr Asp His Phe Val Lys 
195 200 205 

Asp Glu Gin Asp Phe Met lie Thr ser Cys Cys cys Pro Met Trp val 
210 215 220 

Gly Met Leu Lys Lys Val Tyr Asn Asp Leu Val Lys Tyr Val Ser Pro 
225 230 235 240 

ser val ser Pro Met lie Ala Ala Gly Arg val Leu Lys Leu Leu Asn 
245 250 255 

Pro Asn cys Lys val val Phe val Gly Pro Cys lie Ala Lys Lys Ala 
260 265 270 

Glu Ala Arg Glu Lys Asp Leu Leu Gly Asp Ile Asp Phe Val Leu Thr
275 280 285

Phe Thr Glu Leu Arg Asp lie Phe Asp Val Phe Asp lie Gin Pro Glu 
290 295 300 

Asn Leu Glu Glu Asp Phe Ser Ser Glu Tyr Ala Ser Lys Gly Gly Arg 
305 310 315 320 

Leu Tyr Ala Arg Thr Gly Gly val ser lie Ala val Ser Glu Ala lie 
325 330 335 

Glu Lys Leu Phe Pro Asn Lys Tyr Lys Phe Leu Lys Thr lie Gin Ala 
340 345 350 

Asp Gly val Lys Gly cys Lys ser Leu Leu Asp Lys lie Lys Gin Glu 
355 360 365 

Asp lie Ser Ala Asn Phe Val Glu Gly Met Gly Cys val Gly Gly Cys 
370 375 380 

Val Gly Gly Pro Lys Val lie lie Asp Pro ser Glu Gly Arg Asn Ala 
385 390 395 400

Val Asn Asn Phe Ala Glu Asn ser ser lie Lys val Ser val Asp ser 
405 410 415 

Asn Cys Met Asn Asp lie Leu ser Lys lie Asn lie Asn ser val Glu 
420 425 430 

Asp Phe Lys Asp Lys Asp Lys Ile Ser Ile Phe Glu Arg Glu Phe Lys
435 440 445 

<210> 70 
<212> PRT

<213> Desulfovibrio desulfuricans
<400> 70 

Met Tyr Phe Arg Thr Tyr Asp Asn Thr lie Asn Phe Glu He Met val 
1 5 10 15 

Arg lie Ala Lys Ala Phe His Gly Asp ser Phe Glu Glu Gin val Ala 
20 25 30 

Arg lie Pro Leu Glu Met Arg Pro Arg Lys Ala His ser ser Arg Cys 
35 40 45 

Cys lie Tyr Arg Asp Arg Ala lie lie Arg Tyr Arg Cys Met Ala Met 
50 55 60 

Leu Gly Tyr Ala lie Glu Asp Glu Thr Asp Glu Leu Thr ser Leu ser 
65 70 75 80 

Gin Tyr Ala Lys Gly Ala Leu Glu Arg Asp ser lie Gin Gly Ser Met 
85 90 95 

Leu Thr Phe lie Asp Glu Ala Cys Asn Gly Cys Val Arg Thr His Tyr 
100 105 110 

Glu Ala Thr ser Ala Cys Arg Gly cys Leu Ala Glu Ala cys val Gin 
115 120 125 

His cys Pro Lys Asp Ala Val Arg lie Val Asp Gly Lys ser Arg lie 
130 135 140

Asp Pro Asp Lys cys Val Gin cys Gly Lys cys Met Asn Val Cys Pro 
145 150 155 160 

Tyr His Ala lie val Gin lie Pro lie Pro Cys Glu Glu Ser cys Pro 
165 170 175 

Thr Gly Ala lie ser Lys Asp Glu Cys Gly Lys Gin Val lie Asp Tyr 
180 185 190 

Asp Arg Cys lie Phe Cys Gly Lys cys Met Ala Ala Cys Pro Phe Ala 
195 200 205

Ala Val Leu Glu Lys ser Gin Met lie Asp val Leu Arg Arg lie Arg 
210 215 220 

Glu Gly Arg Lys Val Val Ala lie Val Ala Pro Ala lie Ala Gly Gin 
225 230 235 240 

val Gin Ala Pro Met ser Arg Leu Ala Thr Ala Leu Arg Gin Leu Gly 
245 250 255 

Phe Ala Asp val Ala Glu val Ala ser Gly Ala Asp Thr Thr Ala Arg 
260 265 270 

Leu Glu Ala Asp Glu Phe val Glu Arg Met Glu His Gly Ala Ala Phe 
275 280 285



Met Thr Ser Ser Cys Cys Pro Ala Tyr Thr Gln Leu Val Asp Lys His
290 295 300

Leu Pro Glu Leu Ala Pro Phe Val Ser Asp Thr Arg Thr Pro Met His
305 310 315 320

Tyr Thr Ala Ala Met Val Lys Asp His Asp Pro Asp Met Val Thr Val
325 330 335

Phe Ile Gly Pro Cys Val Ala Lys Arg Asn Glu Gly Lys His Asp Glu
340 345 350

Leu Val Asp His Val Leu Thr Phe Gln Glu Met Val Ala Met Leu Thr
355 360 365

Ala Ala Gly Ile Ser Val Asp Ala Cys Glu Asp Gly Arg Phe Met Phe
370 375 380

Pro Ala Met Arg Glu Gly Arg Ser Phe Pro Val Ser Gly Gly Val Thr
385 390 395 400

Ala Gly Val Gln Ala His Ile Gly Thr Arg Ala Glu Val Arg Pro Leu
405 410 415

Ser Val Asp Gly Leu Asn Lys Lys Thr Phe Arg Gln Leu Lys Thr Trp
420 425 430

Ala Lys Lys Gly Cys Glu Gly Asn Phe Val Glu Val Met Gly Cys Gln
435 440 445

Gly Gly Cys Val Ala Gly Pro Ala Ile Val Met
450 455

<210> 71
<211> 494
<212> PRT

<213> Clostridium tetani
<400> 71

Met Ile Val Phe Glu Asn Gln Leu Lys Lys Leu Lys Tyr Leu Val Leu
1 5 10 15

Lys Glu Val Ala Lys Met Thr Leu Glu Asp Arg Leu Gly Glu Glu Asp
20 25 30

Ile Glu Arg Ile Ser Phe Asp Ile Ile Lys Gly Asp Lys Ala Glu Tyr
35 40 45

Arg Cys Cys Val Tyr Lys Glu Arg Ala Ile Val Tyr Glu Arg Ala Lys
50 55 60

Leu Ala Thr Gly Cys Leu Pro Asn Gly Gln Val Ala Glu Glu Phe Val
65 70 75 80

His Val Glu Asp Asp Asp Gln Ile Ile Tyr Val Ile Asp Ala Ala Cys
85 90 95

Asp Lys cys Pro He Asn Lys Tyr Val Val Thr Glu Ala cys Arg Gly 
100 105 110 

cys Leu Gin His Lys cys Met Glu Val Cys Pro Ala Gly Ser lie Asn 
115 120 125 

Arg Ala Ala Gly Lys Ala Tyr lie Asn His Glu Thr cys Lys Glu cys 
130 135 140 

Gly Leu cys Glu ser Ala Cys Pro Tyr Asn Ala lie Ala Glu Val Met 
145 150 155 160 

Arg Pro cys Arg Arg Ala Cys Pro Thr Gly Ala Leu Gin Met Asn Leu 
165 170 175 

Glu Asp Asn Lys Ala Thr lie Asn Lys Glu Asp Cys lie Asn cys Gly 
180 185 190 

Ser cys Met ser val Cys Pro Phe Gly Ala lie Ser Asp Lys Ser Tyr 
195 200 205 

lie Val Asp lie Thr Lys Ala Leu Lys Asn Asn Lys Lys val Tyr Ala 
210 215 220 

Met Val Ala Pro Ala lie Thr Gly Gin Phe Gly Lys Asp val ser Val 
225 230 235 240 

Gly Lys Met Lys Asn Ala Phe Lys Ala Met Gly Phe Glu Asp Met Leu 
245 250 255 

Glu Val Ala cys Gly Ala Asp Ala Val Ala Ala His Glu Ser Glu Glu 
260 265 270 

Phe lie Glu Arg Leu Glu ser Gly Lys Lys Tyr Met Thr Thr ser Cys 
275 280 285 

Cys Pro Gly Phe Leu Gly Tyr lie Glu Lys Lys Phe Pro Asp Gin Leu 
290 295 300 

Glu Asn Val Ser Asn Thr Val Ser Pro Met Val Ala lie Gly Arg Met 
305 310 315 320 

lie Lys Lys Glu Tyr Glu Asp Ser Val Val Val Phe Val Gly Pro Cys 
325 330 335 

Thr Ala Lys Lys Ala Glu lie Lys Arg Lys Gly lie Lys Asp Ala val 
340 345 350 

Asp Tyr val Met Thr Phe Glu Glu lie Ala Ala Leu Met Gly Ala Phe 
355 360 365 

Glu lie Asp Pro Ala Glu Cys Glu Glu Glu Asp lie Asn Asp Gly ser 
370 375 380 
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PBSlTTVtVBIdlS-^ Q^fiMrfAja Gin Gly Gly Gly val val ser Ala He 
385 390 395 400 

Gin Asn Cys lie Lys Asp Lys Glu Gly lie Lys Phe Asn Pro Leu Arg 
405 410 415 

Val ser Gly Pro Asp Gin lie Lys Arg Ala Men lie Met Ala Lys Val 
420 425 430 

Gly Lys Leu ser Glu Asn Phe lie Glu Gly Met Met cys Glu Gly Gly 
435 440 445 

cys He Gly Gly Pro Ala Thr Met val Ser Ala val Lys Ala Lys Ala 
450 455 460 

Pro Leu Met Lys Phe Ser Lys ser ser Thr lie Lys Asp val Lys Asp 
465 470 475 480 

Asn Glu val Leu Asp Lys Tyr Lys Asp lie Asn Met Glu Arg 
485 490 

<210> 72 
<211> 203 
<212> PRT 

<213> Arabidopsis thaliana
<400> 72 

Met Asp Leu lie Lys Leu Lys Gly val Asp Phe Lys Asp Leu Glu Glu 
1 5 10 15 

ser Pro Leu Asp Arg Val Leu Thr Asn val Thr Glu Glu Gly Asp Leu 
20 25 30 

Tyr Gly Val Ala Gly Ser Ser Gly Gly Tyr Ala Glu Thr lie Phe Arg 
35 40 45 

His Ala Ala Lys Ala Leu Phe Gly Gin Thr lie Glu Gly Pro Leu Glu 
50 55 60 

Phe Lys Thr Leu Arg Asn Ser Asp Phe Arg Glu val Thr Leu Gin Leu 
65 70 75 80 

Glu Gly Lys Thr val Leu Lys Phe Ala Leu cys Tyr Gly Phe Gin Asn 
85 90 95 

Leu Gin Asn lie val Arg Arg Val Lys Thr Arg Lys Cys Asp Tyr Gin 
100 105 110

Tyr Val Glu lie Met Ala Cys Pro Ala Gly cys Leu Asn Gly Gly Gly 
115 120 125 

Gin lie Lys Pro Lys Thr Gly Gin ser Gin Lys Glu Leu lie His ser 
130 135 140 

Leu Glu Ala Thr Tyr Met Asn Asp Thr Thr Leu Asn Thr Asp Pro Tyr 
145 150 155 160 
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165 170 175 

Gly ser Asn Glu Ala Lys Lys Tyr Leu His Thr Gin Tyr His Pro Val 
180 185 190 

val Lys Ser val Thr Ser Gin Leu Asn Asn Trp 
195 200 

<210> 73 

<211> 449 

<212> PRT 

<213> Clostridium perfringens 

<400> 73 

Met Asn Lys Lys Tyr Asn ser Leu Phe Lys Glu Leu lie Ser Ser Tyr 
1 5 10 15 

Tyr ser Glu Asp Asn Phe Asp Glu Lys Leu Asn Asp lie Val Lys Asn 
20 25 30 

Asn Phe Asn Ser Lys Glu Asp Ala lie Glu Val Leu Ser Ser Leu Cys 
35 40 45 

Gly val Asp He Asp Lys Asn Ser Asp Asn lie Ala Tyr Asp lie Arg 
50 55 60 

Lys Ala lie Thr Thr His Lys lie Lys Lys Asn lie val Asp Lys Val 
65 70 75 80 

Ser val Cys Thr Lys Asn cys Ser Lys Glu Ser Lys Gly Lys cys Gin 
85 90 95 

Ser Leu Cys Pro Phe Asp Ala lie Leu Thr Asp Pro lie Asp Asn Ser 
100 105 110 

Lys Tyr lie Asp Pro Asn Leu cys Gin Asn cys Gly lie cys val Gin 
115 120 125 

Val cys Glu ser Gly His Phe Leu Asp Arg lie Glu Leu Leu Pro lie 
130 135 140 

lie Asp Leu lie Lys Asn Asn Glu Thr val lie Ala Ala Val Ala Pro 
145 150 155 160 

Ala Ile Ala Gly Gln Phe Gly Glu Asn Val Ser Leu Asp Met Leu Arg
165 170 175 

Glu Ala Phe lie Lys lie Gly Phe ser Asp Met lie Glu val Ala Phe 
180 185 190 

Ala Ala Asp Met Leu ser lie Lys Glu Ala Val Glu Phe Asn His His 
195 200 205 

Val Glu Lys Thr Gly Asp lie Leu lie Thr Ser cys cys Cys Pro Met 
210 215 220 
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mVym Sli »- EM. IT® as cys Tyr Lys Asp Leu val Lys Asp val 
225 230 235 240 

Ser Pro ser val Ser Pro Met lie Ala Ala Gly Arg val lie Lys Lys 
245 250 255 

Leu Asn Lys Asp Ala Lys Val val Phe lie Gly Pro cys lie Ala Lys 
260 265 270 

Lys Ala Glu Ala Arg Glu Lys Asp Leu Val Gly Ala lie Asp Tyr Val 
275 280 285 

Leu Thr Phe Glu Glu Leu Asn Gly lie Phe Glu Ala Leu Lys lie Asp 
290 295 300 

Pro ser ser Met Lys Gly Val Pro Ser lie Glu Tyr Thr Ser Arg Gly 
305 310 315 320 

Gly Arg Leu Tyr Ala Arg Thr Gly Gly Val Ser Glu Ala Ile Asn Asp
325 330 335 

Val Val Lys Glu Leu Tyr Pro Asp Lys Ala Lys lie Phe Lys Ala Val 
340 345 350 

Gin Ala Asn Gly Val Lys Glu Cys Lys Glu Leu Leu Asn Lys Val Gin 
355 360 365 

ser Gly Glu Leu Lys Ala Asn Phe lie Glu Gly Met Gly cys val Gly 
370 375 380 

Gly cys val Gly Gly Pro Lys Arg lie val Asp Pro Ser lie Gly Lys 
385 390 395 400 

Lys His Val Asp Glu val Ala Tyr Asn ser Pro lie Lys val Ala Thr 
405 410 415 

His Ser His Thr Met Asp Glu Val Leu Leu Arg Leu Gly Ile Asn Ser
420 425 430 

Leu Lys ser Phe Glu Asp Lys Glu Lys lie Ser lie Phe Glu Arg Glu 
435 440 445 

Phe 



<210> 74 

<211> 359 

<212> PRT 

<213> Desulfitobacterium hafniense 

<400> 74 

Met Ala Gin ser Glu lie Met Lys lie Arg Arg Gin Val Leu Lys Ser 
1 5 10 15 

Ala Leu Asp Trp val ser His Asp Gin Asn Arg Lys Asp Arg Ala Thr 
20 25 30
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Hlfeu teUI^QS.fiaJ:ia.flb3 Asp Gly Thr Pro Arg Tyr Arg cys cys 
35 40 45 

lie His Lys Glu Arg Ala Val lie Glu Glu Arg Leu Lys Ala val Leu 
50 55 60 

Glu Pro Asp Glu Gly Pro lie Val Arg Val Leu Lys Glu Gly cys Asn 
65 70 75 80 

Gly cys Glu Met His Arg Tyr Ser Val Thr Asp His cys Gin Asn cys 
85 90 95 

Val Gly His Phe Cys Phe Thr Asn cys Pro Lys Lys Ala lie Leu Phe 
100 105 110 

lie Asn Asn Lys Ala Phe lie Asp Gin Thr Arg cys val Glu cys Gly 
115 120 125 

Leu cys Ala Arg Asn cys Pro Tyr His Ala lie lie Glu Tyr Arg Arg 
130 135 140 

Pro cys Glu Asp ser cys Pro Thr Lys Ala lie ser val Arg Glu Asp 
145 150 155 160 

Arg lie Ala Ser lie Ala Glu Ala His Cys Thr Ser cys Gly Lys cys 
165 170 175 

lie lie Ser Cys Pro Phe Gly Ala Val Ala Glu Ser Ser Gin Leu lie 
180 185 190 

His Leu Phe Glu Ala Val Arg Asn Pro Glu His Lys lie Tyr Ala Val 
195 200 205 

lie Ala Pro Ala Phe Val Gly Gin Phe Gly Arg Lys val ser Pro Gly 
210 215 220 

Gin val Lys ser Ala Leu Leu Lys Leu Gly Phe Gin Asp Val Leu Glu 
225 230 235 240 

Ala Ala Leu Gly Ala Asp Arg Thr lie Glu Leu Glu Ala Arg Glu Tyr 
245 250 255

Asp Glu Arg Leu Ala His Gly Glu Glu Phe Met Thr ser Ser cys Cys 
260 265 270 

Pro Ala Tyr Val Ser Ala val lie Lys Glu Lys Pro Asp Leu Phe His 
275 280 285 

His lie ser ser Thr Leu Ser Pro Met Ala Gin Val Ala His lie Leu 
290 295 300 

Lys Glu Lys Asp Pro Glu Ala Lys lie Ala Phe lie Gly Pro Cys Val 
305 310 315 320 

Ala Lys Lys Glu Glu Gly Lys Arg Pro Glu Thr Lys Val Asp Phe Val 
325 330 335 

Leu Thr Phe Glu Glu Leu Met Val Trp Leu Asp Tyr Ala Gly lie Asn 
340 345 350 

Pro Ala Glu Glu Ser Glu Gln
355

<210> 75 

<211> 790 

<212> PRT 

<213> Geobacter metallireducens

<400> 75 

Met cys His Trp Leu His Arg Glu Ala Gly Leu val Tyr Asp Pro Ala 
1 5 10 15 

Val Asp Gin Ala lie Asn Arg val ser Gly Leu Thr Leu Ser Ala Gly 
20 25 30 

Arg Thr Met Glu Pro lie lie Thr val Lys Glu Lys Cys Arg Lys Cys 
35 40 45 

Tyr cys cys val Arg Ser cys Pro Val Lys Ala lie Lys val Ala Lys 
50 55 60 

Ser Tyr Thr Glu lie lie Val Asp Arg cys lie Gly cys Gly Asn Cys 
65 70 75 80 

Leu Ser Asn Cys Pro Gln Gln Ala Lys Met Val Ala Asp Lys Val Gly
85 90 95 

Val Thr Glu Lys Leu Leu ser ser Gly Glu Glu val lie Ala Val Leu 
100 105 110 

Gly ser Ser Phe Pro Ala Phe Phe His Asn val Thr Pro Gly Gin Leu 
115 120 125 

Val Ala Gly Leu Arg Lys lie Gly Phe Ala Glu val His Glu Gly ser 
130 135 140 

Tyr Gly Ala Glu Leu Ile Ala Asp Asp Tyr Ala Arg Ile Thr Ser Glu
145 150 155 160 

Lys Gly His Pro Arg Ile Ser Ser His Cys Pro Ala Ile Val Asp Leu
165 170 175 

lie Glu Arg His Tyr Pro Lys Leu val Gly Asn Leu Val Pro Val Val 
180 185 190 

Ser Pro Met Val Ala Met Gly Arg Tyr Leu Lys Gly Thr Leu Gly Gln
195 200 205 

His val Arg Val Val Tyr lie ser ser cys Val Ala Asn Lys Leu Glu 
210 215 220 

Thr Gin Thr Gin Glu Thr Arg Gly Ala Val Asp lie Val Leu Thr Tyr 
225 230 235 240 
Arg Glu Leu Glu Gly Ile Phe Arg Ser Arg Gln Ile Ala Leu Pro Ala
245 250 255 

Leu Ala Asp Glu Pro Leu Asp Gly lie Arg Pro Gly Ala Gly Arg Leu 
260 265 270 

Phe Pro lie Ala Asp Gly Thr Phe Arg Ala Phe Gly lie Pro Phe Asp 
275 280 285 

Pro Leu Asp Thr Glu lie Val Ala Ala Cys Gly Glu Val Asn val Met 
290 295 300 

Gly lie lie Asn Asp Leu Ala Ala Gly Arg lie Ser Pro Arg lie Ala 
305 310 315 320 

Asp Leu Arg Phe Cys Tyr Asp Gly Cys lie Gly Gly Pro Gly Arg Asn 
325 330 335 

Arg Ala Leu Thr Glu Phe Tyr Arg Arg Asn Arg val lie Ala His Phe 
340 345 350 

Lys Gin Glu val Pro cys Arg Thr Val Pro Asn Ser Leu Leu Glu Ala 
355 360 365 

Gly Arg Val ser Phe Gly Arg Ser Phe Ala Ser Lys Tyr Ala Lys Leu 
370 375 380 

Glu Ala Pro Lys Ala Asn Asp Val Arg Lys lie Leu Asn Ala Thr Asn 
385 390 395 400

Lys Phe Thr Val Lys Asp Glu Leu Asn cys Arg Ala Cys Gly Tyr Arg 
405 410 415 

Thr cys Arg Glu Tyr Ala Val Ala Val Phe Gin Gly Leu Ala Glu lie 
420 425 430 

Glu Met Cys Leu Pro Tyr Asn Leu Gin Gin Leu Glu Glu Asp Arg Gly 
435 440 445 

Arg Leu lie Gin Lys Tyr Glu Leu Ala Arg Arg Glu Leu Glu Arg Glu 
450 455 460 

Tyr Gly Asp Glu Phe lie val Gly Asn Asp Arg Lys Thr Leu Asp Val 
465 470 475 480 

Leu Gly Leu lie Lys Gin Val Gly Pro Thr Pro Thr Thr val Leu lie 
485 490 495

Arg Gly Glu Ser Gly Thr Gly Lys Glu Leu Thr Ala Arg Ala lie His 
500 505 510 

Arg Tyr Ser Lys Arg Asn Asp Lys Pro Leu val Thr val Asn cys Thr 
515 520 525 

Thr Ile Thr Asp Ser Leu Leu Glu Ser Glu Leu Phe Gly His Lys Arg
530 535 540 
Gly Ala Phe Thr Gly Ala Val Ala Asp Lys Lys Gly Leu Phe Glu Ala
545 550 555 560 

Ala Asp Gly Gly Thr lie Phe Leu Asp Glu lie Gly Asp lie Thr Pro 
565 570 575 

Lys Leu Gin Ala Glu Leu Leu Arg Val Leu Asp Met Gly Glu val Arg 
580 585 590 

Pro Val Gly Gly Thr Ala Ala Lys Lys Val Asp Val Arg Leu lie Ala 
595 600 605 

Ala Thr Asn Lys Asn Leu Glu Gin Gly Val Arg Glu Gly Trp Phe Arg 
610 615 620 

Glu Asp Leu Tyr Tyr Arg Leu Asn Val Phe Thr lie Thr Met Pro Pro 
625 630 635 640 

Leu Arg Ser Arg val Glu ser lie Pro lie Leu Val His His Phe Met 
645 650 655 

Asp Lys Ala Ser Thr Lys Leu Asn Lys Arg Met val Gly lie Glu Asp 
660 665 670 

Arg Ala Val Lys Ala Leu Thr Lys Tyr Pro Trp Pro Gly Asn lie Arg 
675 680 685 

Glu Met Gin Asn val lie Glu Arg Ala Ala Val Leu Thr His Asp Gly 
690 695 700 

Val lie Arg Val Glu Asn Phe Pro Leu Ala Leu Ser Glu Gly Leu Glu 
705 710 715 720 

Glu Gly Phe Ala Thr Gly Leu Asp lie His Ala Ala ser Phe Arg ser 
725 730 735 

Glu Arg Glu Gin His Met Gly Lys Leu Glu Lys Lys Leu lie Gin Arg 
740 745 750 

Tyr Leu Thr Glu Ala Asn Gly Asn lie ser Arg Ala Ala Lys Leu Ala 
755 760 765 

Asn lie Pro Arg Arg Thr Phe Tyr Arg Leu Leu Asp Lys Tyr Arg Leu 
770 775 780 

Arg Glu Arg Asp Val Arg 
785 790 

<210> 76 
<211> 450 
<212> PRT 

<213> Clostridium acetobutylicum 
<400> 76 

Met Asn Asn Lys Tyr lie Glu Leu Phe Lys Ser Leu Val Asp ser Tyr 
1 5 10 15 

Tyr Asn Asp Thr Phe Asp Ser Phe Val Tyr His Ile Leu Ser Asp Glu 
20 25 30 

Glu val Asp Lys Lys Glu Leu ser Lys Val lie Ser Ser Leu Cys Gly 
35 40 45 

val ser val Glu Phe Lys Asp Thr Glu Thr Tyr lie ser Glu Leu Lys 
50 55 60 

Lys Ala lie ser Asn Tyr Lys cys Thr Asp Asn lie val Glu Lys lie 
65 70 75 80 

Lys Glu Cys Asp ser ser cys His ser Asn Glu Gly Glu Thr Pro cys 
85 90 95 

Gin Lys Ser Cys Pro Phe Asp Ala lie Leu Val Asp Lys Asn Thr Lys 
100 105 110 

Thr Ser His lie Gin Lys Asp Leu Cys Thr Asp cys Gly Asn Cys lie 
115 120 125 

Thr Ser Cys Pro Ser Gly Ser lie Leu Asp Lys lie Glu Phe Met Pro 
130 135 140 

Leu Leu Asn Leu Phe Lys Asn Asn Glu Thr Val lie Ala Ala Val Ala 
145 150 155 160 

Pro Ala lie Ala Gly Gin Phe Gly Glu Asn val Ser Leu Glu Met Leu 
165 170 175 

Arg Thr Ala Phe Lys Lys Val Gly Phe Ala Asp Met Val Glu Val Ala 
180 185 190 

Phe Phe Ala Asp Met Leu Thr lie Lys Glu Ala Phe Glu Phe Asn Glu 
195 200 205 

Leu val Asn ser Lys Asp Asp Leu Met lie Thr ser cys Cys cys Pro 
210 215 220 

Met Trp val ser Met lie Arg Lys lie Tyr Lys Asp Leu Ala Arg His 
225 230 235 240 

Val ser Pro ser val ser Pro Met lie Ala ser Gly Arg val lie Lys 
245 250 255 

Lys Leu Asn Pro Asn Cys Lys val val Phe lie Gly Pro Cys lie Ala 
260 265 270 

Lys Lys Ala Glu ser Arg Ser Gin Asp lie ser Asp Ala lie Asp Phe 
275 280 285 

Val Leu Thr Phe Glu Glu Leu Lys Gly lie Phe Asp val Leu Asp lie 
290 295 300 

Asp Pro Glu Lys Leu Pro Glu Thr His Thr Lys Ser Tyr Ala Ser Arg 
305 310 315 320 

Glu Gly Arg Leu Tyr Gly Arg Thr Gly Gly Val ser Thr Ser val Asp 
325 330 335 

Glu Ala val Lys Arg lie Phe Pro Asn Lys His His Leu Phe Lys Ser 
340 345 350 

Thr Lys Val Asp Gly Val Lys Asp cys Lys Asp lie Leu Asn Lys Thr 
355 360 365 

Gin Ala Gly Asn lie Gly Ala Asn Phe Leu Glu Gly Met Gly Cys Val 
370 375 380 

Gly Gly cys Val Gly Gly Pro Lys Ala lie val His Lys Asp Gin Gly 
385 390 395 400 

Arg Glu ser val Asn Lys Thr Ala Glu Ser Ser Glu lie Lys lie Ser 
405 410 415 

val Asp ser Glu Arg Met Lys Asp lie Leu Ser Arg lie Gly lie Asn 
420 425 430 

Ser lie Glu Asp Phe Gly Asp Lys ser Lys Val Asp He Phe Glu Arg 
435 440 445 

Arg Phe 
450 

<210> 77 

<211> 106 

<212> PRT 

<213> Shewanella oneidensis 

<400> 77 

Met Asn Lys Lys Lys His Leu Phe Ala Glu Asp Ser Phe Phe Leu Ser 
1 5 10 15 

Arg Arg Lys Phe Met Ala Val Gly Ala Ala Phe Val Ala Ala Leu Ala 
20 25 30 

Ile Pro Ile Gly Trp Phe Thr Ser Lys Leu Glu Arg Arg Asn Glu Tyr 
35 40 45 

Ile Lys Ala Arg Ser Gln Gly Leu Tyr Lys Asp Asp Ser Leu Ala Lys 
50 55 60 

Thr Arg Val Ser His Ala Asn Pro Ala Val Glu Lys Tyr Tyr Lys Glu 
65 70 75 80 

Phe Gly Gly Glu Pro Leu Gly His Met Ser His Glu Leu Leu His Thr 
85 90 95 

His Phe Val Asp Arg Thr Lys Leu Ser Ser 
100 105 

<210> 78 
<212> PRT 

<213> Entamoeba histolytica 
<400> 78 

Met Ser Thr Gin Leu Thr Pro Leu Arg Asn Lys lie lie ser Glu Val 
1 5 10 15 

val Lys cys Phe Lys ser Gly Arg Phe lie Glu Asp lie Asp Lys Leu 
20 25 30 

Pro Thr lie Leu Thr Asp Gly Asp Gly Trp Lys Pro Thr Ser Lys Phe 
35 40 45 

Val His Ser Arg Glu Gin Glu Glu Gly He Tyr Arg Glu Lys Val Leu 
50 55 60 

Ser val Leu Gly Phe Val Asp Gly Glu Tyr Asp Asp lie Thr Pro Leu 
65 70 75 80 

His val Tyr Ala Gin Lys Ala Leu Glu Arg Thr ser Leu His Glu Pro 
85 90 95 

Val Phe Gly lie Ser Gin Lys Gly cys Asn Lys Cys His Phe Asn Gly 
100 105 110 

Tyr Phe val Thr Gin Ala cys Glu Gly Cys Thr ser Arg Pro cys ser 
115 120 125 

val Asn cys Pro Lys Lys Cys lie ser Phe Gly Glu Asp Gly Arg Ala 
130 135 140 

Val lie Asn Gin Asn Asn Cys lie Lys Cys Gly Arg cys Tyr Lys Phe 
145 150 155 160 

Cys Pro Tyr Gly Ala lie lie ser Lys ser val Pro cys val Lys Ala 
165 170 175 

cys Pro cys Gly Ala Met Leu Asp Ser Pro Glu Gly val Lys Thr lie 
180 185 190 

Asp Phe Glu Lys cys lie Asn cys Gly Gly cys Met Arg Ala Cys Pro 
195 200 205 

Phe Gly Ala lie Leu Pro Arg ser Asn Leu lie Asp val Leu Lys lie 
210 215 220 

Leu Pro Thr Lys Lys Val Val Ala Cys Pro Ala Pro Ser lie Ala Ala 
225 230 235 240 

His Phe Gly Lys Tyr Asp Leu Ala Leu Val Ser Gly Gly Leu He Gin 
245 250 255 

val Gly Phe Thr ser val Glu Asp Val Ser Tyr Gly Ala Asp Leu cys 
260 265 270 

Ala Leu Asn Glu Ala Lys Glu Phe Glu Glu Arg Ile Val Lys Asn Lys 
275 280 285 

Lys Asp Phe Met Thr Thr ser cys cys Pro Ala Tyr He Asn Ala lie 
290 295 300 

Asn Lys His Met Pro Glu Leu Lys Glu Asn val Ser His Thr Pro Thr 
305 310 315 320 

Pro Met His Phe Ala Thr Gin Ala Val Lys Asp Arg Asp Gin Glu Thr 
325 330 335 

Val Thr val Phe lie Gly Pro cys Asn Ala Lys Arg Trp Glu Thr Leu 
340 345 350 

Gin Asp ser Thr Thr Asp Tyr cys Leu Thr Phe Asp Glu lie Phe Gly 
355 360 365 

Leu Phe Glu Gly ser Gly lie Asp Leu ser Lys val Gin Pro Tyr Thr 
370 375 380 

Phe val Asp Lys Ala His Lys Glu Gly Lys lie Phe Ala Val Ser Gly 
385 390 395 400 

Gly val Ala ser Ala Val Ala Ser Leu Leu Pro Lys Glu Val Pro Asp 
405 410 415 

Gly val lie Lys Pro Thr lie lie Asp Gly Phe ser Gin Glu Asn Phe 
420 425 430 

Lys Arg Leu Lys Asn Phe Lys Lys Asn lie Thr Gly Asn Leu val Glu 
435 440 445 

val Met Val Cys Glu Gly Gly cys Ala Tyr Gly Pro Gly cys Pro Gly 
450 455 460 

Leu Asn Thr Pro Ala Thr Ser Ala Lys lie Lys lie Ala val Asp Lys 
465 470 475 480 

Met Glu Ala His Pro Glu Gly Arg Trp val Gly Leu Pro Asn ser Gin 
485 490 495 

lie Lys Pro lie Lys Val Glu Asn 
500 

<210> 79 
<211> 560 
<212> PRT 

<213> Cryptosporidium parvum 
<400> 79 

Met Phe ser Thr Ala val Lys Leu Ala Asn Leu Asp Asp Tyr Leu Glu 
1 5 10 15 

Ser ser Gin Asp Cys lie Val Ser Leu Leu ser Asp Lys Asp Asp Thr 
20 25 30 

Lys Pro Lys Ile Ala Val Met Arg Pro Ala Lys Ala Gln Gly Asn Lys 
35 40 45 

Asp Asp Lys Lys Ser Gly Thr Ser Asp Lys Ala Thr Val Asn Val Ala 
50 55 60 

Asp cys Leu Ala cys Ser Gly cys val Thr ser Ala Glu Ala Lys Leu 
65 70 75 80 

Leu Glu Asp Gin Asn Val Ser Glu Phe Met Asn lie Leu Lys Gin Lys 
85 90 95 

Arg Leu Thr Val val Ser lie Ser Asn Gin Ser Cys Ser ser Phe Ala 
100 105 110 

Cys His Leu Asn cys Asp Leu lie Thr lie Gin Arg Lys Leu ser Gly 
115 120 125 

Leu Phe Lys His lie Gly Ala Arg Phe Val Met Asn Ser Thr lie Ser 
130 135 140 

Glu Tyr Ile Ser Leu Leu Glu Thr Lys Tyr Glu Phe Ile Ser Arg Tyr 
145 150 155 160 

Lys Ala Lys ser Asp Leu Pro Met lie lie Ser His cys Pro Gly Trp 
165 170 175 

lie cys Tyr ser Glu Lys ser Leu Asn ser ser Val Leu Pro Leu Leu 
180 185 190 

Ser Lys val Arg ser Ala Gin Gin Leu Gin Gly lie Leu lie Lys Thr 
195 200 205 

Leu Thr Leu Glu lie Tyr Asn Gin Leu Leu Phe Leu Tyr Lys Phe Arq 
210 215 220 

Leu Ser Asn Ser Tyr Arg Thr Asn Met Asn Val Lys Ser Thr Phe Thr 
225 230 235 240 

Gin Asn Asp Asn Phe Val Glu Gin Ser Asp lie Phe His val Ala He 
245 250 255 

Met Pro Cys His Asp Lys Lys Leu Glu ser Thr Arg Ser ser Leu Ser 
260 265 270 

Leu Lys ser Ser Asp Lys Asn ser Ser cys Pro Glu val Asp lie val 
275 280 285 

Leu Ala Thr Ser Glu Val Gly Glu lie lie Lys Leu Ala Gly Phe Asn 
290 295 300 

Ser Leu Leu Asp val Pro Glu Ala Pro Leu Asp Asn Leu Trp Leu Asn 
305 310 315 320 

Gin Asn Phe Gin lie Thr Lys Lys His Asn Leu Ser Leu Leu lie Thr 
325 330 335 

a p oft if Asn$ UlEi ykU S!bftlMn:;!iGl n He Leu Asn Gin Phe ser Trp Leu lie 
340 345 350 

Pro ser Tyr Phe Asn Ser Asn Ser Gly Gly Phe cys Glu Tyr lie lie 
355 360 365 

Arg Ser Ala lie Lys Glu Leu Ala Gly Asp His lie Asp Asn Lys val 
370 375 380 

Gin Leu Pro Phe Asn Lys Leu Lys Asn Asp lie Leu Glu Ala Lys Tyr 
385 390 395 400 

lie Lys Asn Asn Val Glu Leu Asn Tyr cys Leu Ala Tyr Gly Phe Arq 
405 410 415 

Ala lie Gin ser lie ser Arg Lys Leu Asn Leu Gin Lys Asn Ala Ser 
420 425 430 

Gin Asn Thr Gin Tyr Lys Gin Ser val val Asn His val Asn Tyr His 
435 440 445 

Leu lie Glu Ala Met Ala Cys Pro Thr Gly Cys Val Ser Gly Gly Glv 
450 455 460 

Gin lie Leu Ser Gin Asn Asp Gin Asn Asp Asp Asn ser Asp Leu Asn 
465 470 475 480 

Lys Leu Arg Lys Asn lie Lys Phe lie Asp Glu Val Gin Glu Ala Leu 
485 490 495 

Tyr Lys Gly lie Asn Leu Asn Lys Asn Gin Glu val lie Leu Pro Asp 
500 505 510 

Glu lie Pro lie Val Asn lie Leu Tyr Glu Tyr Leu lie His lie Asp 
515 520 525 

Lys Gin lie Asp Arg ser Ser Gly Leu Lys Leu Pro Phe Leu Arg Asn 
530 535 540 

Asp Phe Val Ser lie Asn Glu val Pro Thr Ala Ser Ser Leu Lys Trp 
545 550 555 560 

<210> 80 

<211> 469 

<212> PRT 

<213> Kluyveromyces lactis 

<400> 80 

Met ser Ala Leu Leu Arg Asp Ala Asp Leu Asn Asp Phe lie Ser Pro 
1 5 10 15 

Gly Leu Ala cys val Lys Pro Ala Gin Pro Gin Lys val Glu Lys Lys 
20 25 30 

Pro ser Phe Glu val Glu val Gly lie Glu Ser ser Glu Pro Glu Lys 
35 40 45 

50 55 60 

Ser ser Glu Glu lie Leu Leu Ser Lys Gin Ser His Lys Val Phe Leu 
65 70 75 80 

Glu Lys Trp ser Glu Leu Glu Glu Leu Asp Glu Arg ser Leu Ala val 
85 90 95 

ser lie Ser Pro Gin cys Arg Leu Ser Leu Ala Asp Tyr Tyr ser Met 
100 105 110 

Cys Leu Ala Asp Leu Asp Arg Cys Phe Gin Asn Phe Met Lys Thr Lys 
115 120 125 

Phe Asn Ala Lys Tyr Val Val Gly Thr Gin Phe Gly Arg ser He ser 
130 135 140 

lie ser Arg lie Asn Ala Thr Leu Lys Asp Arg Val Pro Glu Asn Glu 
145 150 155 160 

Gly Pro Leu Leu Cys Ser Val cys Pro Gly Phe Val Leu Tyr Ala Glu 
165 170 175 

Lys Thr Lys Pro Glu Leu lie Pro His Met Leu Asp val Lys ser Pro 
180 185 190 

Gin Gin lie Thr Gly Asn Leu Leu Lys Gin Ala Asp Pro Thr Cys Tyr 
195 200 205 

His Leu ser lie Met Pro cys Phe Asp Lys Lys Leu Glu Ala ser Arq 
210 215 220 

Glu Glu cys Glu Lys Glu val Asp Cys val lie Thr Pro Lys Gin Phe 
225 230 235 240 

val Ala Met Leu Gly Asp Leu ser lie Asp Phe Lys ser Tyr Met Thr 
245 250 255 

Glu Tyr Asp ser Ser Lys Glu Leu cys Pro ser Gly Trp Asp Tyr Lys 
260 265 270 

Leu His Trp Leu Ser Asn Glu Gly Ser Ser ser Gly Gly Tyr Ala Tyr 
275 280 285 

Gin Tyr Leu Leu Ser Leu Gin Ser Ser Asn Pro Glu Ser Asp lie lie 
290 295 300 

Thr lie Glu Gly Lys Asn ser Asp val Thr Glu Tyr Arg Leu Val ser 
305 310 315 320 

Lys Ser Lys Gly val lie Ala Ser Ser Ser Glu val Tyr Gly Phe Arg 
325 330 335 

Asn lie Gin Asn Leu val Arg Lys Leu ser Gin Ser Ala ser val Lys 
340 345 350 

Lys Arg Gly Ile Lys Val Lys Arg Arg Gly Gln Ser Val Leu Lys Ser 
355 360 365 

Gly Glu Thr Ser Glu Lys Thr Thr Lys Val Leu Thr Ala Asp Pro Ala 
370 375 380 

Lys Thr Asp phe val Glu val Met Ala cys Pro ser Gly cys lie Asn 
385 390 395 400 

Gly Gly Gly Leu Leu Asn Glu Glu Lys Asn Ala Asn Arg Arg Lys Gin 
405 410 415 

Leu Ala Gin Asp Leu ser Leu Ala Tyr Thr Lys val His ser Val Asn 
420 425 430 

lie Pro Asp lie Val His Ala Tyr Asp Asp Lys ser Asn Asp Phe Lys 
435 440 445 

Tyr Asn Leu Arg Val lie Glu Pro Ser Thr ser ser Asp val val Ala 
450 455 460 

val Gly Asn Thr Trp 
465 

<210> 81 
<211> 365 
<212> PRT 

<213> Encephalitozoon cuniculi 
<400> 81 

Met Asp Ala Leu lie Arg Pro Pro Met Ser Phe Phe Ala Asp Leu Pro 
1 5 10 15 

Lys Asp Asn Lys Lys Cys lie Lys lie Gly Ser Pro Leu Ala Leu Ser 
20 25 30 

Leu ser Asp cys Leu Ala cys ser Gly cys val Ser Ala Asp Glu Ala 
35 40 45 

Gly Ala Leu ser Glu Asp Leu Ser Phe Val Leu Asp Leu Ser Pro Gin 
50 55 60 

Thr ser Phe val Leu Ser Pro Gin Ser Lys He Asn lie Phe Asn Leu 
65 70 75 80 

Tyr Arg Glu Asp Gly Met Glu Tyr Arg Glu Phe Glu Ala Val Leu ser 
85 90 95 

Ser Phe Leu Arg Ser Lys Phe Asn lie His Arg lie Val Asp Thr Ser 
100 105 110 

Tyr Leu Arg Ser Lys lie Tyr Glu Glu Thr Tyr Arg Glu Tyr Met Ala 
115 120 125 

Thr Asn His Leu lie Val Ser Ala cys Pro Gly val Val Thr Tyr lie 
130 135 140 

Glu Arg Thr Ala Pro Tyr Leu Ile Gly Tyr Leu Ser Arg Val Lys Ser 
145 150 155 160 

Pro Gin Gin Met Ala Phe ser Leu val Lys Gly Ser Arg Thr Val Ser 
165 170 175 

val Met Pro Cys Gin Asp Lys Lys Leu Glu Asn Gly Arg Asp Gly Val 
180 185 190 

Lys Phe Asp Phe lie Leu Thr Thr Arg Gly Phe cys Lys Ala Leu Asp 
195 200 205 

ser Leu Gly Phe Arg Arg Pro Ala Arg Ala Ser Gly Lys Ser Leu Cys 
210 215 220 

Ser Met Glu Glu Ala Glu Thr Thr Gin Trp Asn lie Gly Thr ser ser 
225 230 235 240 

Gly Gly Tyr Ala Glu Phe lie Leu Gly Lys His Cys Val Val Glu Thr 
245 250 255 

Arg Glu lie Arg Asn Gly lie Lys Glu His Leu Leu Asp Asp Gly Arg 
260 265 270 

Thr lie Ser Gin lie Thr Gly Leu Glu Asn Ser lie Asn Tyr Phe Lys 
275 280 285 

Ser ser Lys Thr Lys Gly Pro Arg His Lys Met Thr Glu lie Phe Leu 
290 295 300 

Cys Lys Asn Gly cys lie Gly Gly Pro Gly Gin Glu Arg val Asn Asp 
305 310 315 320 

val Glu Met Asp lie Arg Glu Tyr Asp Arg Asn Gly Arg Glu Gin Pro 
325 330 335 

Arg lie Phe Tyr Ser Ser Pro Gly Leu Glu Glu Lys Arg Val Phe Arg 
340 345 350 

Glu Val Lys Ala Lys Arg Val Asp Leu Arg val Asp Trp 
355 360 365 

<210> 82 

<211> 127 

<212> PRT 

<213> Tritrichomonas foetus 
<220> 

<221> misc_feature 

<222> (85)..(85) 

<223> Xaa can be any naturally occurring amino acid 
<220> 

<221> misc_feature 

<222> (124)..(124) 

<223> Xaa can be any naturally occurring amino acid 

<400> 82 

Ser Val Ala Gly Gln Gly Val Leu Lys 
1 5 10 15 

Leu Val Lys Val Gly Asn Lys Lys Leu Val ser Thr Lys Ser Gly Lys 
20 25 30 

Pro Leu Gin Glu Thr Asn cys lie Lys cys Gly Gin cys Thr Leu Val 
35 40 45 

Cys Gly Pro Gly Ala Leu Thr Gin Lys Asp Ala lie Gin Thr val Ser 
50 55 60 

Glu val Leu Lys Asn Pro Gly Asp Lys Val Leu Val Cys Gin Thr Ala 
65 70 75 80 

Pro Ala Ile Arg Xaa Asn Leu Ala Asp Gly Leu Gly Met Pro Ala Gly 
85 90 95 

Ser lie lie Thr Gly Lys Met val Thr Ala Leu Lys Met Leu Gly Phe 
100 105 110 

Lys Tyr Val Phe Asp Thr Asn Phe Gly Thr Asp Xaa Thr lie Gly 
115 120 125 

<210> 83 

<211> 449 

<212> PRT 

<213> Scenedesmus obliquus 

<400> 83 

Met Pro Glu Trp Gin Pro Gly Gly Arg Tyr Ala Val Ser val Arg Pro 
1 5 10 15 

Pro val Asn Arg Arg Ala val val Ala Ala Glu Arg Arg Arg Leu Val 
20 25 30 

val Arg Ala Ala Gly Pro Thr Ala Glu cys Asp cys Pro Pro Ala Pro 
35 40 45 

Ala Pro Lys Ala Pro His Trp Gin Gin Thr Leu Asp Glu Leu Ala Lys 
50 55 60 

Pro Lys Glu Gin Arg Lys val Met He Ala Gin lie Ala Pro Ala val 
65 70 75 80 

Arg val Ala lie Ala Glu Thr Met Gly Leu Asn Pro Gly Asp val Thr 
85 90 95 

val Gly Gin Met val Thr Gly Leu Arg Met Leu Gly Phe Asp Tyr val 
100 105 110 

Phe Asp Thr Leu Phe Gly Ala Asp Leu Thr lie Met Glu Glu Gly Thr 
115 120 125 

Glu Leu Arg His Arg Leu Gin Asp His Leu Glu Gin His Pro Asn Lys 
130 135 140 
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€^y^MM^h^X!^mmPhe Thr ser cys Cys Pro Gly Trp val Ala 
145 150 155 160 

Met Val Glu Lys Ser Asn Pro Glu Leu lie Pro Tyr Leu Ser Ser Cys 
165 170 175 

Lys ser Pro Gin Met Met Leu Gly Ala Val lie Lys Asn Tyr Phe Ala 
180 185 190 

Ala Glu Ala Gly Ala Lys Pro Glu Asp lie cys Asn val ser val Met 
195 200 205 

Pro cys val Arg Lys Gin Gly Glu Ala Asp Arg Glu Trp Phe Asn Thr 
210 215 220 

Thr Gly Ala Gly Gly Ala Asn val Asp His val Met Thr Thr Ala Glu 
225 230 235 240 

Leu Gly Lys lie Phe Val Glu Arg Gly lie Lys Leu Asn Asp Leu Gin 
245 250 255 

Glu Ser Pro Phe Asp Asn Pro Val Gly Glu Gly Ser Gly Gly Gly Val 
260 265 270 



Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Val 
275 280 285 



Tyr Glu Val Val Thr Gin Lys Pro Leu Asp Arg lie Val Phe Glu Asp 
290 295 300 

Val Arg Gly Leu Glu Gly lie Lys Glu Ser Thr Leu His Leu Thr Pro 
305 310 315 320 

Gly Pro Thr Ser Pro Phe Lys Ala Phe Ala Gly Ala Asp Gly Thr Gly 
325 330 335 

lie Thr Leu Asn lie Ala Val Ala Asn Gly Leu Gly Asn Ala Lys Lys 
340 345 350 

Leu lie Lys Gin Leu Ala Ala Gly Glu ser Lys Tyr Asp Phe lie Glu 
355 360 365 

Val Met Ala Cys Pro Gly Gly Cys Ile Gly Gly Gly Gly Gln Pro Arg 
370 375 380 

Ser Ala Asp Lys Gin lie Leu Gin Lys Arg Gin Ala Ala Met Tyr Asp 
385 390 395 400 

Leu Asp Glu Arg Ala Val lie Arg Arg ser His Glu Asn Pro Leu lie 
405 410 415 

Gly Ala Leu Tyr Glu Lys Phe Leu Gly Glu Pro Asn Gly His Lys Ala 
420 425 430 

His Glu Leu Leu His Thr His Tyr Val Ala Gly Gly val Pro Asp Glu 
435 440 445 

<210> 84 

<211> 477 

<212> PRT 

<213> Anopheles gambiae 

<400> 84 

Ser Arg Phe Ser Ser Ala Leu Gin Leu Thr Asp Leu Asp Asp Phe lie 
1 5 10 15 

Thr Pro Ser Gin Glu Cys lie Lys Pro Val Lys lie Glu Thr ser Lys 
20 25 30 

ser Lys Thr Gly Ala Lys lie Thr lie Gin Glu Asp Gly Ser Tyr val 
35 40 45 

Gin Glu Ser ser Ser Gly lie Gin Lys Leu Glu Lys val Glu lie Thr 
50 55 60 

Leu Ala Asp cys Leu Ala Cys ser Gly Cys lie Thr ser Ala Glu Gly 
65 70 75 80 

val Leu lie ser Gin Gin ser Gin Glu Glu Leu Leu Arg val Met Asn 
85 90 95 

Ala Asn Asn Leu Ala Lys Leu Asn Asn Gin Arg Asp Glu lie Lys Phe 
100 105 110 

val Val Phe Thr val Ser Gin Gin Pro lie Leu Ser Leu Ala Arg Lys 
115 120 125 

Tyr Asn Leu Thr Pro Glu Asp Thr Phe Glu His lie Ala Gly Tyr Phe 
130 135 140 

Lys Lys Leu Gly Ala Asp Met val Val Asp Thr Lys lie Ala Asp Asp 
145 150 155 160 

Leu Ala Leu lie Glu Cys Arg Asn Glu Phe lie Glu Arg Tyr Asn Thr 
165 170 175 

Asn Arg Lys Leu Leu Pro Met Leu Ala Ser Ser cys Pro Gly Trp val 
180 185 190 

Cys Tyr Ala Glu Lys Thr His Gly Asn Phe lie Leu pro Tyr lie Ala 
195 200 205 

Thr Thr Arg Ser Pro Gin Gin He Met Gly val Leu Val Lys Gin Tyr 
210 215 220 

Leu Ala Lys Gin Leu Gin Thr Thr Gly Asp Arg lie Tyr His Val Thr 
225 230 235 240 

Val Met Pro Cys Tyr Asp Lys Lys Leu Glu Ala ser Arg Glu Asp Phe 
245 250 255 

Phe Ser Glu Val Glu Asn Ser Arg Asp Val Asp Cys Val Ile Thr Ser 
260 265 270 

lie Glu lie Glu Gin Met Leu Asn Ser Leu Asp Leu Pro Ser Leu Gin 
275 280 285 

Leu Val Glu Arg Cys Ala lie Asp Trp Pro Trp Pro Thr val Arg Pro 
290 295 300 

Ser Ala Phe Val Trp Gly His Glu Ser ser Gly ser Gly Gly Tyr Ala 
305 310 315 320 

Glu Tyr lie Phe Lys Tyr Ala Ala Arg Lys Leu Phe Asn val Gin Leu 
325 330 335 

Asp Thr val Ala Phe Lys Pro Leu Arg Asn Asn Asp Met Arg Glu Ala 
340 345 350 

Val Leu Glu Gin Asn Gly Gin val Leu Met Arg Phe Ala lie Ala Asn 
355 360 365 

Gly Phe Arg Asn lie Gin Asn Met Val Gin Lys Leu Lys Arg Gly Lys 
370 375 380 

Ser Thr Tyr Asp Tyr Val Glu lie Met Ala Cys Pro Ser Gly Cys Leu 
385 390 395 400 

Asn Gly Gly Ala Gin lie Arg Pro Glu Glu Gly Arg Ala Ala Arg Glu 
405 410 415 

Leu Thr Ala Glu Leu Glu Cys Met Tyr Arg Ser Leu Pro Gln Ser Thr 
420 425 430 

Pro Glu Asn Asp Cys val Gin Thr Met Tyr Ala Thr Phe Phe Asp ser 
435 440 445 

Glu Gly Asp Leu Asn Lys Arg Gin ser Leu Leu His Thr ser Tyr His 
450 455 460 

Gin lie Glu Lys lie Asn Ser Ala Leu Asn lie Lys Trp 
465 470 475 

<210> 85 
<211> 410 
<212> PRT 

<213> Shewanella oneidensis 
<400> 85 

Met Thr Thr Thr Thr Tyr Gin Pro Gly Glu lie Gin Gly Leu lie Lys 
1 5 10 15 

lie Asn Ala ser Lys cys Lys Gly Cys Asp Ala cys Lys Gin Phe Cys 
20 25 30 

Pro Thr His Ala lie Asn Gly Ala ser Gly Ala val His ser lie Asp 
35 40 45 

Glu Asp Lys Cys Leu Ser Cys Gly Gln Cys Leu Ile Asn Cys Pro Phe 
50 55 60 

ser Ala lie Glu Glu Thr His ser Ala Leu Glu Thr Val lie Lys Lys 
65 70 75 80 

Leu Ala Asp Lys Asn Thr Thr Val val Gly lie lie Ala Pro Ala Val 
85 90 95 

Arg Val Ala lie Gly Glu Glu Phe Gly Leu Gly Thr Gly Glu Leu val 
100 105 110 

Thr Gly Lys Leu Tyr Gly Ala Met Asn Gin Ala Gly Phe Lys lie Phe 
115 120 125 

Asp cys Asn Phe Ala Ala Asp Leu Thr lie Met Glu Glu Gly Ser Glu 
130 135 140 

Phe lie His Arg Leu His Ala Asn Val Lys Gly Glu Ala Asn Ala Gly 
145 150 155 160 

Pro Leu Pro Gin Phe Thr ser cys Cys Pro Gly Trp val Arg Tyr Leu 
165 170 175 

Glu Thr Arg Tyr Pro Ala Leu Leu Pro Asn Leu Ser Thr Ala Lys Ser 
180 185 190 

Pro Gin Gin Met Ala Gly Thr Val Ala Lys Thr Tyr Gly Ala Lys Val 
195 200 205 

Tyr Gin Met Gin Pro Glu Asn lie Phe Thr val Ser val Met Pro cys 
210 215 220 

Thr ser Lys Lys Leu Glu Ala ser Arg Pro Glu Phe Asn ser Ala Trp 
225 230 235 240 

Gin Tyr His Gin Glu His Gly Ala Asn Ser Pro Ser Tyr Gin Asp lie 
245 250 255 

Asp Ala val Leu Thr Thr Arg Glu Met Ala Gin Leu Leu Lys Leu Leu 
260 265 270 

Asp lie Asp Leu Ala Asn Thr Ala Glu Tyr Gin Gly Asp Ser Leu phe 
275 280 285 

ser Glu Tyr Thr Gly Ala Gly Thr lie Phe Gly Thr Thr Gly Gly Val 
290 295 300 

Met Glu Ala Ala Leu Arg Thr Ala His Lys val Leu Thr Gly Thr Glu 
305 310 315 320 

Met Ala Lys Leu Glu Phe Glu Pro Val Arg Gly Leu Lys Gly Val Lys 
325 330 335 

Ser Ala ser Val ser Leu Phe Asp Thr Glu Leu Asn Gin Asp val Thr 
340 345 350 


Val Asn Val Ala val Val His Asp Met Gly Asn Asn lie Glu Pro Val 
355 360 365 

Leu Arg Asp val Met Ala Gly Thr ser Pro Tyr His Phe lie Glu Val 
370 375 380 

Met Asn Cys Ala Gly Gly cys val Asn Gly Gly Gly Gin Pro lie Glu 
385 390 395 400 

Gly Lys Gly ser Ser Trp Leu Gly Asn lie 
405 410 

<210> 86 
<211> 606 
<212> PRT 

<213> Clostridium thermocellum 
<400> 86 

Met Ala Phe Val Trp Arg Asn Val Arg ser Arg Pro Phe Pro Lys Lys 
1 5 10 15 

Pro Asn Gly Arg Gly Cys Glu Lys Met Gin Met Val Asn Val Thr lie 
20 25 30 

Asp Asn cys Lys lie Gin Val Pro Ala Asn Tyr Thr val Leu Glu Ala 
35 40 45 

Ala Lys Gin Ala Asn lie Asp lie Pro Thr Leu Cys Phe Leu Lys Asp 
50 55 60 

lie Asn Glu Val Gly Ala Cys Arg Met Cys val Val Glu Val Lys Gly 
65 70 75 80 

Ala Arg Ser Leu Gln Ala Ala Cys Val Tyr Pro Val Ser Glu Gly Leu 
85 90 95 

Glu Val Tyr Thr Gin Thr Pro Ala val Arg Glu Ala Arg Lys Val Thr 
100 105 110 

Leu Glu Leu lie Leu Ser Asn His Glu Lys Lys Cys Leu Thr cys Val 
115 120 125 

Arg ser Glu Asn cys Glu Leu Gin Arg Leu Ala Lys Asp Leu Asn val 
130 135 140 

Lys Asp lie Arg Phe Glu Gly Glu Met ser Asn Leu Pro lie Asp Asp 
145 150 155 160 

Leu Ser Pro Ser Val Val Arg Asp Pro Asn Lys Cys Val Leu Cys Arg 
165 170 175 

Arg cys val ser Met Cys Lys Asn val Gin Thr Val Gly Ala lie Asp 
180 185 190 

Val Thr Glu Arg Gly Phe Arg Thr Thr Val Ser Thr Ala Phe Asn Lys 
195 200 205 

Pro Leu Ser Glu Val Pro Cys Val Asn Cys Gly Gln Cys Ile Asn Val 
210 215 220 

Cys Pro val Gly Ala Leu Arg Glu Lys Asp Asp lie Asp Lys Val Trp 
225 230 235 240 

Glu Ala Leu Ala Asn Pro Glu Leu His val Val val Gin Thr Ala Pro 
245 250 255 

Ala Val Arg Val Ala Leu Gly Glu Glu Phe Gly Met Pro lie Gly ser 
260 265 270 

Arg val Thr Gly Lys Met Val Ala Ala Leu Ser Arg Leu Gly Phe Lys 
275 280 285 

Lys Phe Asp Thr Asp Thr Ala Ala Asp Leu Thr Ile Met Glu Glu 
290 295 300 

Gly Thr Glu Leu lie Asn Arg He Lys Asn Gly Gly Lys Leu Pro Leu 
305 310 315 320 

He Thr ser cys Ser Pro Gly Trp lie Lys Phe Cys Glu His Asn Tyr 
325 330 335 

Pro Glu Phe Leu Asp Asn Leu ser ser cys Lys ser Pro His Glu Met 
340 345 350 

Phe Gly Ala val Leu Lys Ser Tyr Tyr Ala Gin Lys Asn Gly lie Asp 
355 360 365 

Pro ser Lys val Phe Val Val Ser lie Met Pro cys Thr Ala Lys Lys 
370 375 380 

Phe Glu Ala Gin Arg Pro Glu Leu ser ser Thr Gly Tyr Pro Asp val 
385 390 395 400 

Asp val val Leu Thr Thr Arg Glu Leu Ala Arg Met lie Lys Glu Thr 
405 410 415 

Gly lie Asp Phe Asn Ser Leu Pro Asp Lys Gin Phe Asp Asp Pro Met 
420 425 430 

Gly Glu Ala ser Gly Ala Gly Val lie Phe Gly Ala Thr Gly Gly Val 
435 440 445 

Met Glu Ala Ala lie Arg Thr Val Gly Glu Leu Leu ser Gly Lys Pro 
450 455 460 

Ala Asp Lys lie Glu Tyr Thr Glu val Arg Gly Leu Asp Gly lie Lys 
465 470 475 480 

Glu Ala Ser lie Glu Leu Asp Gly Phe Thr Leu Lys Ala Ala val Ala 
485 490 495 

His Gly Leu Gly Asn Ala Arg Lys Leu Leu Asp Lys Ile Lys Ala Gly 
500 505 510 

Glu Ala Asp Tyr His Phe lie Glu He Met Ala Cys Pro Gly Gly Cys 
515 520 525 

lie Asn Gly Gly Gly Gin Pro lie Gin Pro Ser Ser val Arg Asn Trp 
530 535 540 

Lys Asp lie Arg cys Glu Arg Ala Lys Ala lie Tyr Glu Glu Asp Glu 
545 550 555 560 

Ser Leu Pro lie Arg Lys Ser His Glu Asn Pro Lys lie Lys Met Leu 
565 570 575 

Tyr Glu Glu Phe Phe Gly Glu Pro Gly Ser His Lys Ala His Glu Leu 
580 585 590 

Leu His Thr His Tyr Glu Lys Arg Glu Asn Tyr Pro val Lys 
595 600 605 

<210> 87 

<211> 279 

<212> PRT 

<213> Desulfitobacterium hafniense 

<400> 87 

Met Thr Met Gly Gin Leu Arg Ala Ala Leu Lys His Leu Gly Phe Tyr 
1 5 10 15 

Gly Met lie Glu val Ala Leu Phe Ala Asp val Leu Ser Leu Lys Glu 
20 25 30 

Ala Leu Glu Phe Asp Lys His val Gin Thr Asp Lys Asp Phe Val Leu 
35 40 45 

Thr ser Cys Cys cys Pro lie Trp val Gly Met val Lys Arg Val Tyr 
50 55 60 

Asp Thr Leu Val Pro His lie Ser Pro Ser val Ser pro Met val Ala 
65 70 75 80 

Cys Gly Arg Gly lie Lys Arg Leu His Pro Asp Ala Lys Thr val Phe 
85 90 95 

lie Gly Pro Cys lie Ala Lys Lys Ala Glu Ala Lys Glu Pro Asp lie 
100 105 110 

Arg Asp Ala Val Asp Ala Val Leu Thr Phe His Glu Leu Lys Gin lie 
115 120 125 

Phe Glu Ala Thr Asp lie Glu Pro ser Glu Met Glu Asp lie Pro ser 
130 135 140 

Glu His ser ser Thr Ser Gly Arg lie Tyr Ala Arg Thr Gly Gly val 
145 150 155 160 

Ser Lys Ser Ile Ser Asp Thr Leu Asn Arg Ile Arg Pro Asp Lys Pro 
165 170 175 

Val Lys Ile Lys Ser Ile Gln Ala Asn Gly Ile Lys Glu Cys Lys Ala 
180 185 190 

Leu Leu Asn Asp lie Met Asn Asn Glu lie Lys Ala Asn Phe Tyr Glu 
195 200 205 

Gly Met Gly cys pro Gly Gly cys Val Gly Gly pro Lys Ala lie val 
210 215 220 

Asp val Asp Arg Gly Thr Glu Phe val Asn Lys Tyr Gly Ala Glu Ala 
225 230 235 240 

Asp Ala Leu Thr Pro Ala Asp Asn Gln His Val Leu Glu Leu Leu Lys 
245 250 255 

Gin Leu Gly lie Asp Ser Val Glu Glu Leu Leu Gly Gly Glu ser Ala 
260 265 270 

Ala lie Phe Gin Arg Asp Phe 
275 

<210> 88 

<211> 505 

<212> PRT 

<213> C. reinhardtii 

<400> 88 

Met Ala Leu Gly Leu Arg Ala Glu Leu Arg Ala Gly Gin Ala val Ala 
1 5 10 15 

Cys Ala Arg Arg Thr Asn Ala Pro Ala His Pro Ala Ala Val val Pro 
20 25 30 

val Leu Pro ser Arg Gly Asp Lys Phe Phe Asn Leu ser Gin Lys Val 
35 40 45 

Pro ser ser Gin Pro Ala Arg Gly ser Thr lie Arg val Ala Ala Thr 
50 55 60 

Ala Thr Asp Ala Val Pro His Trp Lys Leu Ala Leu Glu Glu Leu Asp 
65 70 75 80 

Lys Pro Lys Asp Gly Gly Arg Lys Val Leu lie Ala Gin Val Ala Pro 
85 90 95 

Ala Val Arg Val Ala Ile Ala Glu Ser Phe Gly Leu Ala Pro Gly Ala 
100 105 110 

Val Ser Pro Gly Lys Leu Ala Ala Gly Leu Arg Ala Leu Gly Phe Asp 
115 120 125 

Gln Val Phe Asp Thr Leu Phe Ala Ala Asp Leu Thr Ile Met Glu Glu 
130 135 140 

Gly Thr Glu Leu Leu His Arg Leu Lys Glu His Leu Glu Ala His Pro 
145 150 155 160 


His Ser Asp Glu Pro Leu Pro Met Phe Thr ser cys cys Pro Gly Trp 
165 170 175 

val Ala Met Met Glu Lys ser Tyr Pro Glu Leu lie Pro Phe val Ser 
180 185 190 

ser cys Lys ser Pro Gin Met Met Met Gly Ala Met val Lys Thr Tyr 
195 200 205 

Leu ser Glu Lys Gin Gly lie Pro Ala Lys Asp He val Met val ser 
210 215 220 

val Met Pro Cys val Arg Lys Gin Gly Glu Ala Asp Arg Glu Trp Phe 
225 230 235 240 

Cys val ser Glu Pro Gly Val Arg Asp val Asp His val lie Thr Thr 
245 250 255 

Ala Glu Leu Gly Asn lie Phe Lys Glu Arg Gly lie lie Leu Pro Glu 
260 265 270 

Leu Pro Asp ser Asp Trp Asp Gin Pro Leu Gly Leu Gly ser Gly Ala 
275 280 285 

Gly val Leu Phe Gly Thr Thr Gly Gly val Met Glu Ala Ala val Arg 
290 295 300 

Thr Ala Tyr Glu lie val Thr Lys Glu Pro Leu Pro Arg Leu Asn Leu 
305 310 315 320 

Ser Glu Val Arg Gly Leu Asp Gly lie Lys Glu Ala Ser Val Thr Leu 
325 330 335 

Val Pro Ala Pro Gly Ser Lys Phe Ala Glu Leu Val Ala Ala Arg Leu 
340 345 350 

Ala His Lys Val Glu Glu Ala Ala Ala Ala Glu Ala Ala Ala Ala Val 
355 360 365 

Glu Gly Ala val Lys Pro Pro lie Ala Tyr Asp Gly Gly Gin Gly Phe 
370 375 380 

Ser Thr Asp Asp Gly Lys Gly Gly Leu Lys Leu Arg Val Ala Val Ala 
385 390 395 400 

Asn Gly Leu Gly Asn Ala Lys Lys Leu lie Gly Lys Met val ser Gly 
405 410 415 

Glu Ala Lys Tyr Asp Phe val Glu lie Met Ala cys Pro Ala Gly cys 
420 425 430 

Val Gly Gly Gly Gly Gin Pro Arg ser Thr Asp Lys Gin lie Thr Gin 
435 440 445 
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^MW^^ Asp Leu Asp Glu Arg Asn Thr Leu Arg 

450 455 460 

Arg ser His Glu Asn Glu Ala Val Asn Gin Leu Tyr Lys Glu Phe Leu 
465 470 475 480 

Gly Glu Pro Leu ser His Arg Ala His Glu Leu Leu His Thr His Tyr 
485 490 495 

Val Pro Gly Gly Ala Glu Ala Asp Ala 
500 505 

<210> 89 
<211> 505 
<212> PRT 

<213> C. reinhardtii 
<400> 89 

Met Ala Leu Gly Leu Arg Ala Glu Leu Arg Ala Gly Gin Ala val Ala 
1 5 10 15 

cys Ala Arg Arg Thr Asn Ala Pro Ala His Pro Ala Ala Val val Pro 
20 25 30 

Val Leu Pro Ser Arg Gly Asp Lys Phe Phe Asn Leu Ser Gin Lys val 
35 40 45 

Pro Ser Ser Gin Pro Ala Arg Gly Ser Thr lie Arg Val Ala Ala Thr 
50 55 60 

Ala Thr Asp Ala Val Pro His Trp Lys Leu Ala Leu Glu Glu Leu Asp 
65 70 75 80 

Lys Pro Lys Asp Gly Gly Arg Lys Val Leu lie Ala Gin Val Ala Pro 
85 90 95 

Ala val Arg val Ala lie Ala Glu Ser Phe Gly Leu Ala Pro Gly Ala 
100 105 110 

Val Ser Pro Gly Lys Leu Ala Ala Gly Leu Arg Ala Leu Gly Phe Asp 
115 120 125 

Gin val Phe Asp Thr Leu Phe Ala Ala Asp Leu Thr lie Met Glu Glu 
130 135 140 

Gly Thr Glu Leu Leu His Arg Leu Lys Glu His Leu Glu Ala His Pro 
145 150 155 160 

His ser Asp Glu Pro Leu Pro Met Phe Thr ser Cys cys Pro Gly Trp 
165 170 175 

val Ala Met Met Glu Lys Ser Tyr Pro Glu Leu lie Pro Phe Val Ser 
180 185 190 

ser cys Lys ser pro Gin Met Met Met Gly Ala Met Val Lys Thr Tyr 
195 200 205 

Ile Pro Ala Lys Asp Ile Val Met Val Ser 
215 220 



Val Met Pro Cys val Arg Lys Gin Gly Val Ala Asp Arg Glu Trp Phe 
225 230 235 240 



Cys val Ser Glu Pro Gly Val Arg Asp val Asp His val lie Thr Thr 
245 250 255 



Ala Glu Leu Gly Asn lie Phe Lys Glu Arg Gly lie lie Leu Pro Glu 
260 265 270 



Leu Pro Asp Ser Asp Trp Asp Gin Pro Leu Gly Leu Gly ser Gly Ala 
275 280 285 



Gly val Leu Phe Gly Thr Thr Gly Gly val Met Glu Ala Ala Val Arg 
290 295 300 



Thr Ala Tyr Glu lie Val Thr Lys Glu Pro Leu Pro Arg Leu Asn Leu 
305 310 315 320 



Ser Glu val Arg Gly Leu Asp Gly lie Lys Glu Ala ser val Thr Leu 
325 330 335 



val Pro Ala Pro Gly ser Lys Phe Ala Glu Leu Val Ala Ala Arg Leu 
340 345 350 



Ala His Lys Val Glu Glu Ala Ala Ala Ala Glu Ala Ala Ala Ala val 
355 360 365 



Glu Gly Ala val Lys Pro Pro lie Ala Tyr Asp Gly Gly Gin Gly Phe 
370 375 380 



Ser Thr Asp Asp Gly Lys Gly Gly Leu Lys Leu Arg Val Ala Val Ala 
385 390 395 400 



Asn Gly Leu Gly Asn Ala Lys Lys Leu lie Gly Lys Met Val ser Gly 
405 410 415 



Glu Ala Lys Tyr Asp Phe Val Glu lie Met Ala Cys Pro Ala Gly 
420 425 430 



Val Gly Gly Gly Gly Gin Pro Arg Ser Thr Asp Lys Gin lie Thr Gin 
435 440 445 



Lys Arg Gin Ala Ala Leu Tyr Asp Leu Asp Glu Arg Asn Thr Leu Arg 
450 455 460 



Arg ser His Glu Asn Glu Ala val Asn Gin Leu Tyr Lys Glu Phe Leu 
465 470 475 480 



Gly Glu Pro Leu ser His Arg Ala His Glu Leu Leu His Thr His Tyr 
485 490 495 



Val Pro Gly Gly Ala Glu Ala Asp Ala 



500 505 

<210> 90 
<211> 608 
<212> PRT 
<213> T. maritima 

<400> 90 

Met Arg Arg Phe Phe Lys Asn Asn Leu Arg Asn Leu ser Gin Asn Gly 
1 5 10 15 

Glu Thr Asn ser val Arg Arg cys Phe Ala Leu Ala Asp Val Thr val 
20 25 30 

val lie Asn Gly Arg Thr Leu Thr val Pro Asp Asn Leu Thr val lie 
35 40 45 

Glu Ala cys Glu Lys Ala Gly lie Glu lie Pro Ala Leu cys His His 
50 55 60 

Pro Arg Leu Gly Glu Ser lie Gly Ala Cys Arg Val Cys Val val Glu 
65 70 75 80 

Val Glu Gly Ala Arg Asn Leu Gln Pro Ala Cys Val Thr Lys Val Arg
85 90 95

Asp Gly Met Val Ile Lys Thr Ser Ser Asp Arg Val Lys Thr Ala Arg
100 105 110 

Lys Phe Asn Leu Ala Leu Leu Leu ser Glu His Pro Asn Asp Cys Met 
115 120 125 

Thr Cys Glu Ala Asn Gly Arg Cys Glu Phe Gin Asp Leu lie Tyr Lys 
130 135 140 

Tyr Asp val Glu Pro lie Phe Gly Tyr Gly Thr Lys Glu Gly Leu val 
145 150 155 160 

Asp Arg ser ser Pro Ala lie val Arg Asp Leu ser Lys cys lie Lys 
165 170 175 

Cys Gln Arg Cys Val Arg Ala Cys Ser Glu Leu Gln Gly Met His Ile
180 185 190 

Tyr ser Met Val Glu Arg Gly His Arg Thr Tyr Pro Gly Thr Pro Phe 
195 200 205 

Asp Met Pro val Tyr Glu Thr Asp Cys lie Gly cys Gly Gin cys Ala 
210 215 220 

Ala Phe Cys Pro Thr Gly Ala lie val Glu Asn Ser Ala Val Lys Val 
225 230 235 240 

Val Leu Glu Glu Leu Glu Lys Lys Glu Lys lie Leu Val val Gin Thr 
245 250 255 

Ala Pro ser Val Arg Val Ala lie Gly Glu Glu Phe Gly Tyr Ala Pro 
260 265 270 

Gly Thr Ile Ser Thr Gly Gln Met Val Ala Ala Leu Arg Arg Leu Gly
275 280 285 

Phe Asp Tyr Val Phe Asp Thr Asn Phe Gly Ala Asp Leu Thr lie Met 
290 295 300 

Glu Glu Gly ser Glu Phe Leu Glu Arg Leu Glu Lys Gly Asp Leu Glu 
305 310 315 320 

Asp Leu Pro Met Phe Thr ser cys cys Pro Gly Trp Val Asn Leu Val 
325 330 335 

Glu Lys Val Tyr Pro Glu Leu Arg Thr Arg Leu Ser Ser Ala Lys Ser
340 345 350

Pro Gin Gly Met Leu Ser Ala Met Val Lys Thr Tyr Phe Ala Glu Lys 
355 360 365 

Leu Gly val Lys Pro Glu Asp lie Phe His val ser lie Met Pro Cys 
370 375 380 

Thr Ala Lys Lys Asp Glu Ala Leu Arg Lys Gin Leu Met val Asn Gly 
385 390 395 400 

val Pro Ala val Asp val Val Leu Thr Thr Arg Glu Leu Gly Lys Leu 
405 410 415 

lie Arg Met Lys Lys lie Pro Phe Ala Asn Leu Pro Glu Glu Glu Tyr 
420 425 430 

Asp Ala Pro Leu Gly Ile Ser Thr Gly Ala Ala Ala Leu Phe Gly Val
435 440 445 

Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Tyr Glu Leu Lys 
450 455 460 

Thr Gly Lys Ala Leu Pro Lys lie Val Phe Glu Glu Val Arg Gly Leu 
465 470 475 480 

Lys Gly val Arg Glu Ala Glu lie Asp Leu Asp Gly Lys Lys lie Arg 
485 490 495 

Ile Ala Val Val His Gly Thr Ala Asn Val Arg Asn Leu Val Glu Lys
500 505 510

lie Leu Arg Arg Glu Val Lys Tyr His Phe val Glu Val Met Ala cys 
515 520 525 

Pro Gly Gly cys lie Gly Gly Gly Gly Gin Pro Tyr ser Arg Asp Pro 
530 535 540 

Glu lie Leu Arg Lys Arg Ala Glu Ala lie Tyr Thr lie Asp Glu Arg 
545 550 555 560 

Met Thr Leu Arg Lys Ser His Glu Asn Pro Ala Ile Lys Lys Leu Tyr
565 570 575 

Glu Glu Tyr Leu Glu His Pro Leu Ser His Lys Ala His Glu Leu Leu
580 585 590 

His Thr Tyr Tyr Glu Asp Arg Ser Arg Lys Lys Arg Leu Ala Val Lys 
595 600 605 

<210> 91 

<211> 497 

<212> PRT 

<213> C. reinhardtii 

<400> 91 

Met ser Ala Leu val Leu Lys Pro cys Ala Ala val ser lie Arg Gly 
1 5 10 15 

Ser Ser cys Arg Ala Arg Gin val Ala Pro Arg Ala Pro Leu Ala Ala 
20 25 30 

ser Thr Val Arg Val Ala Leu Ala Thr Leu Glu Ala Pro Ala Arg Arg 
35 40 45 

Leu Gly Asn Val Ala cys Ala Ala Ala Ala Pro Ala Ala Glu Ala Pro 
50 55 60 

Leu Ser His val Gin Gin Ala Leu Ala Glu Leu Ala Lys Pro Lys Asp 
65 70 75 80 

Asp Pro Thr Arg Lys His Val cys Val Gin val Ala Pro Ala Val Arg 
85 90 95 

Val Ala lie Ala Glu Thr Leu Gly Leu Ala Pro Gly Ala Thr Thr Pro 
100 105 110 

Lys Gin Leu Ala Glu Gly Leu Arg Arg Leu Gly Phe Asp Glu val Phe 
115 120 125 

Asp Thr Leu Phe Gly Ala Asp Leu Thr lie Met Glu Glu Gly Ser Glu 
130 135 140 

Leu Leu His Arg Leu Thr Glu His Leu Glu Ala His Pro His Ser Asp 
145 150 155 160

Glu Pro Leu Pro Met Phe Thr Ser Cys Cys Pro Gly Trp lie Ala Met 
165 170 175 

Leu Glu Lys Ser Tyr pro Asp Leu lie Pro Tyr val Ser ser cys Lys 
180 185 190 

ser Pro Gin Met Met Leu Ala Ala Met Val Lys ser Tyr Leu Ala Glu 
195 200 205 

Lys Lys Gly lie Ala Pro Lys Asp Met Val Met val ser lie Met Pro 
210 215 220 

cys Thr Arg Lys Gin Ser Glu Ala Asp Arg Asp Trp Phe cys Val Asp 
225 230 235 240 

Ala Asp Pro Thr Leu Arg Gln Leu Asp His Val Ile Thr Thr Val Glu
245 250 255

Leu Gly Asn lie Phe Lys Glu Arg Gly lie Asn Leu Ala Glu Leu Pro 
260 265 270 

Glu Gly Glu Trp Asp Asn Pro Met Gly val Gly Ser Gly Ala Gly Val 
275 280 285 

Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Ala 
290 295 300 

Tyr Glu Leu Phe Thr Gly Thr Pro Leu Pro Arg Leu ser Leu Ser Glu 
305 310 315 320 

Val Arg Gly Met Asp Gly lie Lys Glu Thr Asn lie Thr Met val Pro 
325 330 335 

Ala Pro Gly Ser Lys Phe Glu Glu Leu Leu Lys His Arg Ala Ala Ala 
340 345 350 

Arg Ala Glu Ala Ala Ala His Gly Thr Pro Gly Pro Leu Ala Trp Asp 
355 360 365 

Gly Gly Ala Gly Phe Thr Ser Glu Asp Gly Arg Gly Gly lie Thr Leu 
370 375 380 

Arg val Ala val Ala Asn Gly Leu Gly Asn Ala Lys Lys Leu lie Thr 
385 390 395 400 

Lys Met Gin Ala Gly Glu Ala Lys Tyr Asp Phe val Glu lie Met Ala 
405 410 415 

cys Pro Ala Gly Cys Val Gly Gly Gly Gly Gin Pro Arg ser Thr Asp 
420 425 430 

Lys Ala lie Thr Gin Lys Arg Gin Ala Ala Leu Tyr Asn Leu Asp Glu 
435 440 445 

Lys Ser Thr Leu Arg Arg Ser His Glu Asn Pro Ser lie Arg Glu Leu 
450 455 460 

Tyr Asp Thr Tyr Leu Gly Glu Pro Leu Gly His Lys Ala His Glu Leu 
465 470 475 480 

Leu His Thr His Tyr Val Ala Gly Gly val Glu Glu Lys Asp Glu Lys 
485 490 495 

Lys 



<210> 92 

<211> 581 

<212> PRT 

<213> T. tengcongensis

<400> 92




Met Asp Lys Val Arg Val Thr Ile Asp Gly Ile Thr Val Glu Val Pro
1 5 10 15

Ser Tyr Tyr Thr Val Leu Glu Ala Ala Lys Glu Ala Gly Ile Asp Ile
20 25 30 

Pro Thr Leu cys Tyr Leu Lys Glu lie Asn Gin lie Gly Ala cys Arg 
35 40 45 

lie cys Leu Val Glu lie Glu Gly Val Arg Asn Leu Gin Thr ser cys 
50 55 60 

Thr Tyr Pro Val Phe Asp Gly Met Lys Val Tyr Thr Asn Thr Pro Lys 
65 70 75 80 

lie Arg Glu Ala Arg Arg Leu Asn Leu Glu Leu lie Leu Ser Asn His 
85 90 95 

Asp Arg Asn cys Leu Thr Cys Val Arg Ser Thr Asn cys Glu Leu Gin 
100 105 110 

Ala Leu Ala Lys Arg Leu Gly Val Glu Glu lie Arg Phe Glu Gly Glu 
115 120 125 

Asn lie Lys Tyr Pro lie Asp Asp Ala Ser Pro Ala Val Val Arg Asp 
130 135 140 

Pro Asn Lys Cys Val Leu Cys Arg Arg cys Val Ala val cys ser Glu 
145 150 155 160 

val Gin Asn Val Phe Ala lie Gly Met Val Asn Arg Gly Phe Lys Thr 
165 170 175 

Met val Ala Pro Ser Phe Gly Arg Ser Leu Lys Asp ser Pro Cys lie 
180 185 190 

Ser Cys Gly Gin cys lie Met Val Cys Pro Val Gly Ala lie Tyr Glu 
195 200 205 

Lys Asp His Thr Lys Arg Val Tyr Glu Ala Leu Ala Asp Asp Lys Lys 
210 215 220 

Tyr Val Val Ala Gln Thr Ala Pro Ala Val Arg Val Ala Leu Gly Glu
225 230 235 240

Glu Phe Gly Met Pro Val Gly Thr lie val Thr Gly Lys Met Ala Ala 
245 250 255 

Ala Leu Arg Arg Met Gly Phe Asp Ala val Phe Asp Thr Asn Phe Ala 
260 265 270 

Ala Asp Leu Thr lie Met Glu Glu Gly Ser Glu Leu Leu Glu Arg lie 
275 280 285 

Lys His Gly Gly Lys Leu Pro Met Ile Thr Ser Cys Ser Pro Gly Trp
290 295 300

Ile Ala Phe Cys Glu Lys Tyr Tyr Pro Glu Phe Ile Asp Asn Leu Ser
305 310 315 320 

Thr cys Lys ser Pro His Met Met Met Gly Ala Leu val Lys ser Tyr 
325 330 335 

Tyr Ala Glu Lys Lys Gly Leu Asp Pro Lys Asp Ile Phe Val Val Ser
340 345 350

lie Met Pro cys Thr Ala Lys Lys Leu Glu lie Glu Arg Glu Glu Met 
355 360 365 

lie Arg Asn Gly Met Lys Asp val Asp Ala Val Leu Thr Thr Arg Glu 
370 375 380 

Leu Ala Arg Met lie Lys Glu Met Gly lie Asp Phe Val Asn Leu Lys 
385 390 395 400 

Asp Glu Glu Phe Asp Glu Pro Leu Gly Met ser Thr Gly Ala Gly Ala 
405 410 415 

Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Val
420 425 430 

Ala Glu lie val Glu Gly Arg Asp lie Gly Lys lie Asp Phe Glu Glu 
435 440 445 

Val Arg Gly Leu Glu Gly Val Arg Glu Ala Thr Ile Thr Ile Asp Gly
450 455 460

Met Asp Ile Lys Ile Ala Ile Ala Asn Gly Thr Gly Asn Ala Lys Lys
465 470 475 480 

Leu Leu Asp Lys val Lys Ala Gly Glu val Glu Tyr His Phe lie Glu 
485 490 495 

Val Met Gly Cys Pro Gly Gly Cys Ile Met Gly Gly Gly Gln Pro Ile
500 505 510

His Asn Pro Asn Glu Met Glu Glu Val Lys Lys Leu Arg Ala Lys Ala
515 520 525

Ile Tyr Glu Ile Asp Lys Asn Leu Pro Ile Arg Lys Ser His Glu Asn
530 535 540 

Pro Ala lie Lys Arg Leu Tyr Glu Glu Phe Leu Gly Tyr Pro Leu ser 
545 550 555 560 

Glu Lys ser His Glu Leu Leu His Thr His Tyr ser Arg Lys Glu Leu 
565 570 575 

Tyr Pro Leu val Lys 
580 

<210> 93

<212> PRT

<213> N. frontalis 

<400> 93 

Met ser Met Leu ser ser val Leu Asn Lys Ala Val Val Asn Pro Lys 
1 5 10 15 

Leu Thr Arg ser Leu Ala Thr Ala Ala Ala Glu Lys Met Val Asn lie 
20 25 30 

Ser lie Asn Gly Arg Lys Phe Gin Val Lys Pro Lys Thr Thr val Leu 
35 40 45 

Glu Ala Ala Lys Ala Asn Gly Tyr Tyr lie Pro Thr Leu Cys Tyr His 
50 55 60 

Gln Glu Leu Pro Val Ala Gly Asn Cys Arg Leu Cys Leu Val Tyr Ala
65 70 75 80

Lys Gly Ser Trp Lys Pro Leu Thr Ala Cys Thr Thr Glu Val Trp Glu
85 90 95

Gly Met Glu Ile Glu Thr Asp Ser Pro Ala Val Ile Glu Thr Val Arg
100 105 110

Ser Ser Leu Ser Met Met Arg Glu Glu His Pro Asn Asp Cys Met Thr
115 120 125

Cys 5lX Ser Asn Gly Asp Glu Phe Gln Asp Leu Ile Tyr Arg Tyr
130 135 140 

Gln Ile Asp Ala Lys His Pro Val Arg Ser Leu Leu Lys His Lys Ser
145 150 155 160 

Lys Lys Thr Asn His Ser lie Thr Glu Pro Cys Tyr Ser Pro Phe Asp 
165 170 175 

Asn Thr Thr Phe Ser val Ala Arg Asp Met Asn Lys Cys Val Lys cys 
180 185 190 

Gly Arg Cys lie Arg Ala Cys His His Phe Gin Asn lie Asn lie Leu 
195 200 205 

Gly Ile Asn Arg Ala Gly Tyr Glu Arg Val Gly Thr Pro Met Asp
210 215 220 

Arg Pro Met Asn Phe Thr Lys Cys Val Glu Cys Gly Gln Cys Ser Gln
225 230 235 240

Val Cys Pro Val Gly Ala Ile Thr Ala Arg Thr Glu Val Val Asp Val
245 250 255 

Leu Arg His Leu Asp Thr Lys Arg Lys val Val Val Cys Ser Thr Ala 
260 265 270 

Phe Asp Phe Thr Gly Lys Met Val Ala Gly Leu Arg Lys Leu Gly Phe
290 295 300 



Asp Tyr lie Phe Asp Thr Asn Phe Ser Ala Asp Leu Thr lie Met Glu 
305 310 315 320 



Glu Gly Thr Glu Leu lie Asp Arg Leu Asn Asn Gly Gly Lys Phe Pro 
325 330 335 



Met Phe Thr ser Cys Cys Pro Gly Trp lie Asn Met Val Glu Lys ser 
340 345 350 



Tyr Pro Glu Leu ser Asp Asn Leu ser Ser cys Lys ser Pro Gin Gin 
355 360 365 



Met lie Gly Ala val lie Lys ser Tyr Phe Ala Lys Lys Leu Gly Leu 
370 375 380 



ser Thr Glu Asp lie lie His Val Ser lie Met Pro Cys Thr Ala Lys 
385 390 395 400 



Lys Gly Glu Ala Arg Arg Pro Glu Phe val Gin Lys Gly Lys Asp Gl 
405 410 415 



Lys Asp Tyr Pro Asp lie Asp Tyr Val lie Thr Thr Arg Glu Leu Leu 
420 425 430 



Thr Leu Leu Lys Leu Lys Lys Ile Asn Pro Ala Glu Leu Pro Asp Asp
435 440 445 



Lys Phe Asp ser Pro Leu Gly lie Gly Ser Ser Ala Gly Asn Leu Phe 
450 455 460 



Gly val Thr Gly Gly Val Met Glu Ala Ala lie Arg Thr Ala Gin Val 
465 470 475 480 



lie Thr Gly Val Glu Asn Pro lie Pro Leu Gly Glu Leu Lys Ala lie 
485 490 495 



Arg Gly Leu Asp Gly lie Lys Ala Ala Asn val Pro Leu Lys Thr 
500 505 510 



Asp Gly Lys Glu val ser Val Arg Ala Ala val val Ser Gly Gly Ala 
515 520 525 



Asn lie Gin Lys Phe Leu Glu Lys lie Lys Asn Lys Glu Leu Glu Phe 
530 535 540 



Asp Phe lie Glu Met Met Met Cys Pro Gly Gly Cys lie Asn Gly Gly 
545 550 555 560 



Gly Gln Pro Lys Ser Ala Asp Pro Glu Ile Val Ala Lys Lys Met Gln
565 570 575

Arg Met Tyr Thr Met Asp Asp Gln Ala Lys Leu Arg Leu Cys His Glu
580 585 590

Asn Pro Glu lie lie Asp Val Tyr Lys Asn Phe Leu Gly Glu Pro Asn 
595 600 605 

ser His Leu Ala His Glu Leu Leu His Thr His Tyr Asn Asp Arg ser 
610 615 620 

Lys Thr lie His Asp Met Gly His His Glu Lys Lys 
625 630 635 

<210> 94 

<211> 579 

<212> PRT 

<213> C. thermocellum

<400> 94 

Met val Asn Val Thr lie Asp Asn cys Lys lie Gin val Pro Ala Asn 
1 5 10 15 

Tyr Thr val Leu Glu Ala Ala Lys Gin Ala Asn lie Asp lie Pro Thr 
20 25 30 

Leu cys Phe Leu Lys Asp lie Asn Glu val Gly Ala cys Arg Met cys 
35 40 45 

Val Val Glu Val Lys Gly Ala Arg Ser Leu Gln Ala Ala Cys Val Tyr
50 55 60

Pro val ser Glu Gly Leu Glu val Tyr Thr Gin Thr Pro Ala val Arg 
65 70 75 80 

Glu Ala Arg Lys Val Thr Leu Glu Leu lie Leu Ser Asn His Glu Lys 
85 90 95 

Lys cys Leu Thr cys Val Arg ser Glu Asn cys Glu Leu Gin Arg Leu 
100 105 110 

Ala Lys Asp Leu Asn Val Lys Asp lie Arg Phe Glu Gly Glu Met Ser 
115 120 125 

Asn Leu Pro lie Asp Asp Leu ser Pro Ser val val Arg Asp Pro Asn 
130 135 140 

Lys cys val Leu cys Arg Arg cys val Ser Met cys Lys Asn val Gin 
145 150 155 160 

Thr Val Gly Ala lie Asp val Thr Glu Arg Gly Phe Arg Thr Thr val 
165 170 175 

Ser Thr Ala Phe Asn Lys Pro Leu ser Glu val Pro Cys Val Asn Cys 
180 185 190 

Gly Gin Cys lie Asn Val Cys Pro Val Gly Ala Leu Arg Glu Lys Asp 
195 200 205 

Asp Ile Asp Lys Val Trp Glu Ala Leu Ala Asn Pro Glu Leu His Val
210 215 220 

Val Val Gin Thr Ala Pro Ala Val Arg Val Ala Leu Gly Glu Glu Phe 
225 230 235 240 

Gly Met Pro lie Gly ser Arg val Thr Gly Lys Met val Ala Ala Leu 
245 250 255 

ser Arg Leu Gly Phe Lys Lys val Phe Asp Thr Asp Thr Ala Ala Asp 
260 265 270 

Leu Thr lie Met Glu Glu Gly Thr Glu Leu lie Asn Arg lie Lys Asn 
275 280 285 

Gly Gly Lys Leu Pro Leu lie Thr Ser Cys Ser Pro Gly Trp lie Lys 
290 295 300 

Phe cys Glu His Asn Tyr Pro Glu Phe Leu Asp Asn Leu Ser Ser Cys 
305 310 315 320 

Lys ser Pro His Glu Met Phe Gly Ala val Leu Lys Ser Tyr Tyr Ala 
325 330 335 

Gin Lys Asn Gly lie Asp Pro ser Lys Val Phe val Gly ser lie Met 
340 345 350 

Pro Cys Thr Ala Lys Lys Phe Glu Ala Gln Arg Pro Glu Leu Ser Ser
355 360 365

Thr Gly Tyr Pro Asp Val Asp Val Val Leu Thr Thr Arg Glu Leu Ala
370 375 380

Arg Met Ile Lys Glu Thr Gly Ile Asp Phe Asn Ser Leu Pro Asp Lys
385 390 395 400

Gln Phe Asp Asp Pro Met Gly Glu Ala Ser Gly Ala Gly Val Ile Phe
405 410 415

Gly Ala Thr Gly Gly Val Met Glu Ala Ala Ile Arg Thr Val Gly Glu
420 425 430

Leu Leu Ser Gly Lys Pro Ala Asp Lys lie Glu Tyr Thr Glu val Arg 
435 440 445 

Gly Leu Asp Gly lie Lys Glu Ala Ser lie Glu Leu Asp Gly Phe Thr 
450 455 460 

Leu Lys Ala Ala Val Ala His Gly Leu Gly Asn Ala Arg Lys Leu Leu 
465 470 475 480 

Asp Lys lie Lys Ala Gly Glu Ala Asp Tyr His Phe lie Glu lie Met 
485 490 495 

Ala Cys Pro Gly Gly Cys lie Asn Gly Gly Gly Gin Pro lie Gin Pro 
500 505 510 

Ser Ser Val Arg Asn Trp Lys Asp Ile Arg Cys Glu Arg Ala Lys Ala
515 520 525 

lie Tyr Glu Glu Asp Glu ser Leu Pro lie Arg Lys Ser His Glu Asn 
530 535 540 

Pro Lys lie Lys Met Leu Tyr Glu Glu Phe Phe Gly Glu Pro Gly ser 
545 550 555 560 

His Lys Ala His Glu Leu Leu His Thr His Tyr Glu Lys Arg Glu Asn 
565 570 575 

Tyr Pro Val 



<210> 95 
<211> 588 
<212> PRT 

<213> B. thetaiotaomicron
<400> 95 

Met Glu Glu Lys Gln Ile Thr Leu Gln Ile Asp Gly His Phe Ile Thr
1 5 10 15

Val Pro Glu Gly Ser Thr Ile Leu Glu Ala Ala Cys Lys Ile Gly Ile
20 25 30

Asn Ile Pro Thr Leu Cys His Ile Asp Leu Lys Gly Thr Cys Ile Lys
35 40 45

Asn Asn Pro Ala Ser Cys Arg lie cys Val val Glu Val Ala Gly Arg 
50 55 60 

Arg Asn Leu Ala Pro Ala Cys Ala Thr Arg Cys Thr Glu Gly Met Val 
65 70 75 80 

Val Lys Thr ser Thr Leu Arg Val Met Asn Ala Arg Lys val Val Ala 
85 90 95 

Glu Leu lie Leu ser Asp His Pro Asn Asp Cys Leu Thr Cys Pro Lys 
100 105 110 

cys Gly Asn cys Glu Leu Gin Thr Leu Ala Leu Arg Phe Asn lie Arg 
115 120 125 

Glu Met Pro Phe Asn Gly Gly Glu Leu Ser Pro Arg Lys Arg Glu val 
130 135 140 

Thr ser ser lie Val Arg Asn Met Asp Lys Cys lie Phe Cys Arg Arg 
145 150 155 160 

cys Glu ser Val cys Asn Asp Val Gin Thr val Gly Ala Leu Gly Ala 
165 170 175 

lie Arg Arg Gly Phe Asn Thr Thr lie Ala Pro Ala Phe Asp Arg Met 
180 185 190 

Met Lys Asp Ser Glu Cys Thr Tyr Cys Gly Gln Cys Val Ala Val Cys
195 200 205 

Pro val Gly Ala Leu Thr Glu Arg Asp Tyr Thr Asn Arg Leu Leu Asp 
210 215 220 

Asp Leu Ala Asp Pro Asp Lys lie Val lie Val Gin Thr Ala Pro Ala 
225 230 235 240 

Val Arg Ala Ala Leu Gly Glu Glu Phe Gly Leu Pro Pro Gly Thr Leu 
245 250 255 

Val Thr Gly Lys Met Val Tyr Ala Leu Arg Glu Leu Gly Phe Asp Tyr 
260 265 270 

val Phe Asp Thr Asp Phe Ala Ala Asp Leu Thr lie Met Glu Glu Gly 
275 280 285 

[illegible]
290 295 300

val Arg Leu Pro lie Leu Thr ser cys Cys Pro Ala Trp val Asn Phe 
305 310 315 320 

Phe Glu His His Phe Pro Asp Met Leu Asp lie Pro Ser Thr Ala Arg 
325 330 335 

Ser Pro Gin Gin Met Phe Gly ser lie Ala Lys Ser Tyr Trp Ala Glu 
340 345 350 

Lys Met Gly lie Pro Arg Glu Lys Leu val Val val ser lie Met Pro 
355 360 365 

cys Leu Ala Lys Lys Tyr Glu cys Asp Arg Asp Glu Phe Lys Val Asn 
370 375 380 

Gly val Pro Asp val Asp Tyr Ser lie Ser Thr Arg Glu Leu Ala Arg 
385 390 395 400 

Leu lie Lys Arg Ala Asn lie Gly Phe Thr Leu Val Leu Asp ser Pro 
405 410 415 

Phe Asp Asn Pro Met Gly Glu ser Thr Gly Ala Gly Val lie Phe Gly 
420 425 430 

Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg Ser Val Tyr Glu Ile
435 440 445

Tyr Thr Gly Gin Pro Leu Lys Asn Val Asn Phe Glu Gin Val Arg Gly 
450 455 460 

Leu Ser Gly val Arg Arg Ala Thr lie Asp Leu Asn Gly Phe Glu Leu 
465 470 475 480 

Lys Val Gly Ile Ala His Gly Leu Gly Asn Ala Arg His Leu Leu Glu
485 490 495



Asp lie Arg Asn Gly His Asn Glu Tyr His Val lie Glu lie Met Ala 
500 505 510 



Cys Pro Gly Gly Cys lie Gly Gly Gly Gly Gin Pro Leu His His Gly 
515 520 525 



Asn Ser Asp Val Leu Tyr Ala Arg Ala Asn Ala Leu Tyr Arg Glu Asp 
530 535 540 



Ala Asn Lys Pro Leu Arg Lys Ser His Asp Asn Pro Tyr lie Gin Lys 
545 550 555 560 



Leu Tyr Glu Glu Tyr Leu Gly Lys Pro Leu Gly Glu Lys ser Glu Met 
565 570 575 



Leu Leu His Thr His Tyr Phe Asn Lys Ser lie Asp 
580 585 



<210> 96 
<211> 585 
<212> PRT 

<213> D. fructosovorans
<400> 96 

Met Ser Met Leu Thr Ile Thr Ile Asp Gly Lys Thr Thr Ser Val Pro
1 5 10 15

Glu Gly Ser Thr Ile Leu Asp Ala Ala Lys Thr Leu Asp Ile Asp Ile
20 25 30 



Pro Thr Leu cys Tyr Leu Asn Leu Glu Ala Leu ser lie Asn Asn Lys 
35 40 45 



Ala Ala ser cys Arg Val Cys val Val Glu Val Glu Gly Arg Arg Asn 
50 55 60 



Leu Ala Pro Ser cys Ala Thr Pro Val Thr Asp Asn Met val val Lys 
65 70 75 80 



Thr Asn ser Leu Arg val Leu Asn Ala Arg Arg Thr Val Leu Glu Leu 
85 90 95 



Leu Leu ser Asp His Pro Lys Asp cys Leu val Cys Ala Lys Ser Gly 
100 105 110 



Glu cys Glu Leu Gin Thr Leu Ala Glu Arg Phe Gly lie Arg Glu ser 
115 120 125 



Pro Tyr Asp Gly Gly Glu Met Ser His Tyr Arg Lys Asp lie ser Ala 
130 135 140 



Ser lie lie Arg Asp Met Asp Lys Cys lie Met Cys Arg Arg cys Glu 
145 150 155 160 



Thr Met Cys Asn Thr Val Gln Thr Cys Gly Val Leu Ser Gly Val Asn
165 170 175



Arg Gly Phe Thr Ala Val val Ala Pro Ala Phe Glu Met Asn Leu Ala 
180 185 190 



Asp Thr val cys Thr Asn Cys Gly Gin cys Val Ala Val cys Pro Thr 
195 200 205 



Gly Ala Leu Val Glu His Glu Tyr lie Trp Glu Val Val Glu Ala Leu 
210 215 220 



Ala Asn Pro Asp Lys Val Val lie Val Gin Thr Ala Pro Ala Val Arg 
225 230 235 240 



Ala Ala Leu Gly Glu Asp Leu Gly Val Ala Pro Gly Thr Ser Val Thr 
245 250 255 



Gly Lys Met Ala Ala Ala Leu Arg Arg Leu Gly Phe Asp His Val Phe 
260 265 270 



Asp Thr Asp Phe Ala Ala Asp Leu Thr lie Met Glu Glu Gly Ser Glu 
275 280 285 



Phe Leu Asp Arg Leu Gly Lys His Leu Ala Gly Asp Thr Asn val Lys 
290 295 300 



Leu Pro lie Leu Thr Ser Cys Cys Pro Gly Trp Val Lys Phe Phe Glu 
305 310 315 320 



His Gin Phe Pro Asp Met Leu Asp Val Pro Ser Thr Ala Lys ser Pro 
325 330 335 



Gln Gln Met Phe Gly Ala Ile Ala Lys Thr Tyr Tyr Ala Asp Leu Leu
340 345 350



Gly lie Pro Arg Glu Lys Leu Val val Val Ser val Met Pro Cys Leu 
355 360 365 



Ala Lys Lys Tyr Glu cys Ala Arg Pro Glu Phe Ser Val Asn Gly Asn 
370 375 380 



Pro Asp val Asp lie Val lie Thr Thr Arg Glu Leu Ala Lys Leu Val 
385 390 395 400 



Lys Arg Met Asn lie Asp Phe Ala Gly Leu Pro Asp Glu Asp Phe Asp 
405 410 415 



Ala Pro Leu Gly Ala Ser Thr Gly Ala Ala Pro lie Phe Gly Val Thr 
420 425 430 



Gly Gly Val Ile Glu Ala Ala Leu Arg Thr Ala Tyr Glu Leu Ala Thr
435 440 445



Gly Glu Thr Leu Lys Lys Val Asp Phe Glu Asp Val Arg Gly Met Asp
450 455 460

[illegible] Lys Val Gly Asp Asn Glu Leu Val Ile
465 470 475 480

Gly Val Ala His Gly Leu Gly Asn Ala Arg Glu Leu Leu Lys Pro cys 
485 490 495 

Gly Ala Gly Glu Thr Phe His Ala lie Glu Val Met Ala cys Pro Gly 
500 505 510 

Gly cys lie Gly Gly Gly Gly Gin pro Tyr His His Gly Asp Val Glu 
515 520 525 

Leu Leu Lys Lys Arg Thr Gin Val Leu Tyr Ala Glu Asp Ala Gly Lys 
530 535 540 

Pro Leu Arg Lys Ser His Glu Asn Pro Tyr lie lie Glu Leu Tyr Glu 
545 550 555 560 

Lys Phe Leu Gly Lys Pro Leu ser Glu Arg ser His Gin Leu Leu His 
565 570 575 

Thr His Tyr Phe Lys Arg Gin Arg Leu 
580 585 

<210> 97 
<211> 606 
<212> PRT 
<213> D. vulgaris 

<400> 97 

Met Asn Ala Phe Ile Asn Gly Lys Glu Val Arg Cys Glu Pro Gly Arg
1 5 10 15

Thr Ile Leu Glu Ala Ala Arg Glu Asn Gly His Phe Ile Pro Thr Leu
20 25 30

Cys Glu Leu Ala Asp Ile Gly His Ala Pro Gly Thr Cys Arg Val Cys
35 40 45 

Leu val Glu lie Trp Arg Asp Lys Glu Ala Gly Pro Gin lie val Thr 
50 55 60 

Ser cys Thr Thr Pro Val Glu Glu Gly Met Arg lie Phe Thr Arg Thr 
65 70 75 80 

Pro Glu val Arg Arg Met Gin Arg Leu Gin Val Glu Leu Leu Leu Ala 
85 90 95 

Asp His Asp His Asp cys Ala Ala cys Ala Arg His Gly Asp Cys Glu 
100 105 110 

Leu Gln Asp Val Ala Gln Phe Val Gly Leu Thr Gly Thr Arg His His
115 120 125 

Phe Pro Asp Tyr Ala Arg Ser Arg Thr Arg Asp Val ser ser Pro ser 
130 135 140 

[illegible] Cys Ile Arg Cys Leu Arg Cys Val Ala
145 150 155 160

val Cys Arg Asn Val Gin Gly val Asp Ala Leu Val val Thr Gly Asn 
165 170 175 

Gly lie Gly Thr Glu lie Gly Leu Arg His Asn Arg ser Gin ser Ala 
180 185 190 

Ser Asp cys val Gly cys Gly Gin cys Thr Leu val cys Pro val Gly 
195 200 205 

Ala Leu Ala Gly Arg Asp Asp Val Glu Arg Val Ile Asp Tyr Leu Tyr
210 215 220 

Asp Pro Glu lie val Thr val Phe Gin Phe Ala Pro Ala val Arg Val 
225 230 235 240 

Gly Leu Gly Glu Glu Phe Gly Leu Pro Pro Gly Ser Ser Val Glu Gly 
245 250 255 

Gin val Pro Thr Ala Leu Arg Leu Leu Gly Ala Asp val val Leu Asp 
260 265 270 

Thr Asn Phe Ala Ala Asp Leu Val Ile Met Glu Glu Gly Thr Glu Leu
275 280 285

Leu Gin Arg Leu Arg Gly Gly Ala Lys Leu Pro Leu Phe Thr ser Cys 
290 295 300 

Cys Pro Gly Trp Val Asn Phe Ala Glu Lys His Leu Pro Asp lie Leu 
305 310 315 320 

Pro His Val ser Thr Thr Arg ser Pro Gin Gin cys Leu Gly Ala Leu 
325 330 335 

Ala Lys Thr Tyr Leu Ala Arg Thr Met Asn val Ala Pro Glu Arg Met 
340 345 350 

Arg val val ser Leu Met Pro cys Thr Ala Lys Lys Glu Glu Ala Ala 
355 360 365 

Arg Pro Glu Phe Arg Arg Asp Gly Val Arg Asp Val Asp Ala val Leu 
370 375 380 

Thr Thr Arg Glu Phe Ala Arg Leu Leu Arg Arg Glu Gly lie Asp Leu 
385 390 395 400 

Ala Gly Leu Glu Pro Ser Pro Cys Asp Asp Pro Leu Met Gly Arg Ala
405 410 415 

Thr Gly Ala Ala Val lie Phe Gly Thr Thr Gly Gly val Met Glu Ala 
420 425 430 

Ala Leu Arg Thr val Tyr His Val Leu Asn Gly Lys Glu Leu Ala Pro 
435 440 445 

Val Glu Leu His Ala Leu Arg Gly Tyr Glu Asn Val Arg Glu Ala Val
450 455 460 

val Pro Leu Gly Glu Gly Asn Gly Ser Val Lys Val Ala Val val His 
465 470 475 480 

Gly Leu Lys Ala Ala Arg Gin Met Val Glu Ala Val Leu Ala Gly Lys 
485 490 495 

Ala Asp His val Phe val Glu Val Met Ala Cys Pro Gly Gly Cys Met 
500 505 510 

Asp Gly Gly Gly Gin Pro Arg Ser Lys Arg Ala Tyr Asn Pro Asn Ala 
515 520 525 

Gin Ala Arg Arg Ala Ala Leu Phe ser Leu Asp Ala Glu Asn Ala Leu 
530 535 540 

Arg Gin ser His Asn Asn Pro Leu lie Gly Lys val Tyr Glu ser Phe 
545 550 555 560 

Leu Gly Glu Pro cys Ser Asn Leu Ser His Arg Leu Leu His Thr Arg 
565 570 575 

Tyr Gly Asp Arg Lys ser Glu Val Ala Tyr Thr Met Arg Asp lie Trp 
580 585 590 

His Glu Met Thr Leu Gly Arg Arg Val Arg Gly Asp Ser Asp
595 600 605

<210> 98 

<211> 589 

<212> PRT 

<213> T. vaginalis 

<400> 98 

Ala Ser Thr Gly Ile Asn Ser Thr Ala Asn Ile Leu Arg Asn Ile Thr
1 5 10 15

val Thr val Asn Gly Lys Pro Leu Glu Ala Lys Lys Gly Glu Thr val 
20 25 30 

Leu Glu Leu Cys Asp Arg Asn Asn lie Arg lie Pro Arg Leu cys Phe 
35 40 45 

His Pro Asn Leu Pro Pro Lys Ala Ser cys Arg Val Cys Leu val Glu 
50 55 60 

cys Asp Gly Lys Trp Leu ser Pro Ala cys Val Thr Thr val Trp Asp 
65 70 75 80 

Gly Leu Lys lie Asp Thr Lys Ser Lys Asn Val Arg Asp ser Val Glu 
85 90 95 

Asn Asn Leu Lys Glu Leu Leu Asp cys His Asp Glu Thr Cys ser Ala 
100 105 110 

Cys [illegible] Arg Cys Gln Phe Arg Asp Met Asn Val Ala Tyr
115 120 125 

ser val Lys Ala Glu Thr Lys Glu lie cys ser Glu Glu Gly lie Asp 
130 135 140 

Glu ser Thr Asn Ala lie Arg Leu Asp Thr Ser Lys cys val Leu Cys 
145 150 155 160 

Gly Arg Cys lie Arg Ala Cys Glu Glu Val Ala Gly Thr Ser Ala lie 
165 170 175 

lie Phe Gly Asn Arg Ala Lys Lys Met Arg lie Gin Pro Thr Phe Gly 
180 185 190 

Val Thr Leu Gin Glu Thr Ser Cys lie Lys cys Gly Gin cys Thr Leu 
195 200 205 

Tyr Cys Pro Val Gly Ala Ile Thr Glu Lys Ser Gln Val Lys Glu Ala
210 215 220 

Leu Asp lie Leu Ala Asn Lys Gly Lys Lys lie Thr Val val Gin Val 
225 230 235 240 

Ala Pro Ala Val Arg val Ala Leu Ser Glu Ala Phe Gly Tyr Lys Glu 
245 250 255 

Gly Thr val Thr Thr Gly Lys Met val ser Ala Leu Lys Ala Leu Gly 
260 265 270 

Phe Asp Leu Val Tyr Asp Thr Asn Tyr Gly Ala Asp Leu Thr lie Cys 
275 280 285 

Glu Glu Ala Gly Glu Leu val Asn Arg Leu Arg Asp Pro Asn Ala Lys 
290 295 300 

Phe Pro Met Phe Thr Thr cys Cys Pro Ala Trp Val Asn Tyr val Glu 
305 310 315 320 

Gin Ser Ala Pro Asp Phe lie Pro Asn Leu Ser Ser Cys Arg ser Pro 
325 330 335 

Gin Gly Met Leu Ser Ala Leu lie Lys Asn Tyr Leu Pro Lys Leu Leu 
340 345 350 

Asp val Lys Gin Glu Asp val Leu Asn Phe ser lie Met Pro cys Thr 
355 360 365 

Ala Lys Lys Asp Glu Val Glu Arg Pro Glu Leu Arg Thr Lys Ser Gly 
370 375 380 

Leu Lys Glu Thr Asp Met Val Leu Thr Val Arg Glu Leu Val Glu Met 
385 390 395 400 

lie Lys Leu Ser Asn lie Asp Phe Asn Asn Leu Pro Asp Thr Gin Phe 
405 410 415 

Asp Asn Ile Phe Gly Phe Gly Ser Gly Ala Gly Gln Ile Phe Ala Ala
420 425 430 

Thr Gly Gly Val Met Glu Ala Ala Ser Arg Thr Ala Phe Glu Val Tyr 
435 440 445 

Thr Gly Lys Lys Leu Thr Asn Val Asn lie Tyr Pro val Arg Gly Met 
450 455 460 

Asp Gly Leu Arg lie Ala Glu Leu Asp Leu Asp Gly Thr Lys Leu Lys 
465 470 475 480 

Val Ala Val Cys His Gly Ile Ala Asn Thr Ala Lys Leu Leu Asp Arg
485 490 495 

Leu Arg Glu Lys Asp Pro Glu Leu Met Asp lie Lys Phe lie Glu lie 
500 505 510 

Met Ala Cys Pro Gly Gly cys val cys Gly Gly Gly Thr Pro Gin Pro 
515 520 525 

Lys Asn Arg Val Ser Leu Asp Asn Arg Leu Ala Ala Ile Tyr Asn Ile
530 535 540 

Asp Ala Lys Met Glu Cys Arg Lys ser His Glu Asn Pro Leu lie Lys 
545 550 555 560 

Gly val Tyr Lys Glu Phe Leu Gly Lys Pro Asn ser His Leu Ala His 
565 570 575 

Glu Leu Leu His Thr His Phe Lys His His Pro Lys Trp 
580 585 

<210> 99 
<211> 1206 
<212> PRT 

<213> Nyctotherus ovalis
<400> 99 

Met lie ser Arg Leu lie Ala Lys Lys Ala Pro Leu Phe Leu Arg Thr 
1 5 10 15 

Phe Ala Thr Ser Glu Met lie Ser Leu Lys lie Asp Gly Lys lie lie 
20 25 30 

ser val Pro Lys Gly lie Met Leu Ala Asp Ala lie Lys Lys Ala Gly 
35 40 45 

Ala Asn val Pro Thr Met cys Tyr His Pro Asp Leu Pro Thr ser Gly 
50 55 60 

Gly lie cys Arg Val cys Leu Val Glu ser Ala Lys Ser Pro Gly Tyr 
65 70 75 80 

Pro lie lie Ser Cys Arg Thr Pro val Glu Glu Gly Met Glu lie Val 
85 90 95 

Thr Gln Gly Ser Lys Met Lys Glu Tyr Arg Gln Ala Asn Leu Ala Leu
100 105 110 

Met Leu Ser Arg His Pro Asn Ala Cys Leu ser cys Thr ser Asn Thr 
115 120 125 

Asn Cys Lys Thr Gin Glu Leu Ser Ala Asn Met Asn lie Gly Gin Cys 
130 135 140 

Gly Phe Ala Asn Ala Thr Pro Pro Lys Asn Asp Asp Ser Tyr Asp Met 
145 150 155 160 

Thr Thr Ala Ile Glu Arg Asp Asn Asp Lys Cys Ile Asn Cys Asp Ile
165 170 175

cys val His Thr Cys Ser Leu Gin Gly Leu Asn Ala Leu Gly Phe Tyr 
180 185 190 

Asn Glu Glu Gly His Ala Val Lys Ser Met Gly Thr Leu Asp val Ser 
195 200 205 

Glu Cys lie Gin Cys Gly Gin Cys lie Asn Arg cys Pro Thr Gly Ala 
210 215 220 

Ile Thr Glu Lys Ser Glu Ile Arg Pro Val Leu Asp Ala Ile Asn Ile
225 230 235 240

Gin Gin Arg Leu val Phe Gin Met Ala Pro Ser lie Arg Val Ala Val 
245 250 255 

Ala Glu Glu Phe Gly lie Lys Pro Gly Glu Lys lie Leu Lys Asn Glu 
260 265 270 

lie Ala Thr Ala Leu Arg Lys Leu Gly Ser Asn Val Phe val Leu Asp 
275 280 285 

Thr Asn Phe Ser Ala Asp Leu Thr lie lie Glu Glu Gly His Glu Leu 
290 295 300 

lie Glu Arg Leu Tyr Arg Asn Val Thr Gly Lys Lys Leu Leu Gly Gly 
305 310 315 320 

Asp His Met Pro lie Asp Leu Pro Met Leu Thr Ser cys Cys Pro Gly 
325 330 335 

Trp lie Met Phe lie Glu Lys Asn Tyr Pro Asp Leu Leu Asn Asn Leu 
340 345 350 

ser Thr cys Lys Ser Pro Gin Gly Met Leu Gly Ala Leu lie Lys Gly 
355 360 365 

Tyr Trp Ala Lys Asn lie Lys Lys Met Asp Pro Lys Asp lie Val Ser 
370 375 380 

Val Ser Ile Met Pro Cys Thr Ala Lys Lys Ala Glu Lys Glu Arg Pro
385 390 395 400

Gin Leu Arg Gly Asp Glu Gly Tyr Lys Asp val Asp Tyr lie Leu Thr 
405 410 415 

Thr Arg Glu Leu Ala Lys Met Leu Lys Gin Ser Asn lie Asp Leu Ala 
420 425 430 

Lys Met Glu Pro Thr Pro Phe Asp Lys Val Met Ser Glu Gly Thr Gly 
435 440 445 

Ala Ala val lie Phe Gly val Thr Gly Gly Val Met Glu Ala Ala Leu 
450 455 460 

Arg Thr Ala Asn Glu val lie Thr Gly Arg Glu val Pro Phe Lys Asn 
465 470 475 480 

Leu Asn lie Glu Ala Val Arg Gly Met Glu Gly lie Arg Glu Ala Gly 
485 490 495 

lie Lys Leu Glu Asn Val Leu Asp Lys Tyr Lys Ala Phe Glu Gly val 
500 505 510 

Thr val Lys Val Ala lie Ala His Gly Pro Asn Asn Ala Arg Lys val 
515 520 525 

Met Asp lie lie Lys Gin Ala Lys Glu Ser Gly Lys Pro Ala Pro Trp 
530 535 540 

His Phe val Glu Val Met Ala cys Pro Gly Gly cys lie Gly Gly Gly 
545 550 555 560 

Gly Gln Pro Lys Pro Thr Asn Leu Glu Ile Arg Gln Ala Arg Thr Gln
565 570 575

Leu Thr Phe Lys Glu Asp Met Asp Leu Pro Leu Arg Lys Ser His Asp 
580 585 590 

Asn Pro Glu lie Lys Ala lie Tyr Glu Asn Tyr Leu Lys Glu Pro Leu 
595 600 605 

Gly His Asn Ser His His Tyr Leu His Thr Thr Tyr Ser Ser Gin Lys 
610 615 620 

Val Arg Asp Met Asn Leu Tyr Asn Ala Asn Glu Ala Ala Gly Leu Asp 
625 630 635 640 

Glu lie Leu Ala Lys Tyr Pro Lys Glu Lys Glu Tyr Leu Met Pro lie 
645 650 655 

lie lie Glu Glu His Asp Lys Lys Gly Tyr lie Ser Asp Pro ser lie 
660 665 670 

Val Lys lie ser Glu His Leu Gly Met Tyr Pro Ala Gin lie Glu Ser 
675 680 685 

[illegible] Phe Pro Arg Glu His Thr Ile Ala Ile
690 695 700

Leu Met Ser Ile Cys Val His Cys His Asn Cys Met Met Lys Gly Gln
705 710 715 720

Gly Arg Leu Leu Lys Thr Ile Gln Glu Thr Tyr Asp Ile His Glu Thr
725 730 735 

His Gly Gly val Ala Lys Asp Gly ser Phe Thr Leu His Thr Leu Asn 
740 745 750 

Trp Leu Gly Tyr Cys Val Asn Asp Ala Pro Ala Met Met Ile Lys Arg
755 760 765 

Lys Gly Thr Asn Tyr val Glu Thr phe Thr Gly Leu Leu Gly Asp Asn 
770 775 780 

lie Asp Gin Arg Leu Lys Ser Leu Lys Asn Leu Lys Lys Glu Leu Pro 
785 790 795 800 

Lys Trp Pro Lys Asn Asn lie Arg Glu Met Lys ser Gin Arg Asn Gly 
805 810 815 

Asn ser Tyr ser cys Met Asn Thr Gin Ala Pro lie Ala Glu Ala Thr 
820 825 830 

Lys Lys Ala Val Ser Met Gly Pro Glu Lys val lie Glu Glu val Phe 
835 840 845 

Lys ser Asn Leu Val Gly Arg Gly Gly Ala Gly Phe Arg Thr Gly Lys 
850 855 860 

Lys Trp Glu Ser Ala Tyr Lys Thr Pro Ala Ser Asp Lys Tyr Val Val 
865 870 875 880 

Cys Asn Ala Asp Glu Gly Leu Pro ser Thr Tyr Lys Asp Trp cys Leu 
885 890 895 

Leu Asn Asn Glu Ala Lys Arg Lys Glu val Phe Thr Gly Met Gly lie 
900 905 910 

Cys Ala Lys Thr lie Gly Ala Lys Arg Cys Phe Met Tyr Leu Arg Tyr 
915 920 925 

Glu Tyr Arg Asn Leu Val Pro Ala Leu Glu Gin Ser lie Lys Asp Val 
930 935 940 

Gin Ser Thr Cys Pro Glu Leu Ala Asp Leu Lys Tyr Glu lie Arg Leu 
945 950 955 960 

Gly Gly Gly pro Tyr val Ala Gly Glu Glu Asn Ala Gin Phe Glu Ser 
965 970 975 

lie Glu Gly Arg Ala pro Leu Pro Arg Lys Asp Arg pro Gly Asn lie 
980 985 990 
Phe Pro Thr Met Glu Gly Leu Phe His Lys Pro Thr Val Ile Asn Asn
995 1000 1005 

val Glu Thr Phe Phe Ala lie Pro His lie lie Gin Gin Gly Ser 
1010 1015 1020 

Gin Ser Phe Gly Glu Gly Lys Met Pro Lys Leu Leu Ser Val Thr 
1025 1030 1035 

Gly Asp val Asp Glu Pro lie Leu lie Glu Thr Asn Leu Asn Asn 
1040 1045 1050 

Tyr ser Leu Asn His Leu Leu Gin Glu lie Ser Ala Lys Asp lie 
1055 1060 1065 

Val Ala Ala Glu lie Gly Gly cys Thr Glu Pro lie lie Phe Gly 
1070 1075 1080 

Ser Lys Phe Asp Thr Leu Phe Gly Phe Gly Arg Gly Thr Leu Asn 
1085 1090 1095 

Ala val Gly ser val val Leu Phe Asn ser Ser cys Asp Leu Gly 
1100 1105 1110 

Lys lie Tyr Glu Asn Lys Leu Lys Phe Met Ala Glu Glu Ser cys 
1115 1120 1125 

Lys Gin cys val Pro cys Arg Asp Gly Ser Tyr lie Phe His Arg 
1130 1135 1140

Ala Phe Lys Glu Leu Arg Asp Thr Gly Lys Ser Ser Tyr Asn Met 
1145 1150 1155 

Arg Ala Leu Ala Val Ala Ser Glu Ser Ala Ala Arg ser Ser lie 
1160 1165 1170 

cys Ala His Gly Lys Ala Leu Glu Ser Leu Phe Lys Ser Ala Cys 
1175 1180 1185 

Asp Phe Met Asn Lys Thr Lys Pro lie Tyr Gin Pro His Ser Thr 
1190 1195 1200 

Tyr His Gin 
1205 

<210> 100 

<211> 468 

<212> PRT 

<213> T. vaginalis

<400> 100 

Met Leu Ala Ser Ser Ala Thr Ala Met Lys Gly Phe Ala Asn Ser Leu 
1 5 10 15

Arg Met Lys Asp Tyr Ser Ser Thr Gly lie Asn Phe Asp Met Thr Lys 
20 25 30 
Cys Ile Asn Cys Gln Ser Cys Val Arg Ala Cys Thr Asn Ile Ala Gly
35 40 45 

Gin Asn Val Leu Lys Ser Leu Thr val Asn Gly Lys Ser Val Val Gin 
50 55 60 

Thr Val Thr Gly Lys Pro Leu Ala Glu Thr Asn cys lie ser cys Gly 
65 70 75 80 

Gin Cys Thr Leu Gly cys Pro Lys Phe Thr lie Phe Glu Ala Asp Ala 
85 90 95 

lie Asn Pro val Lys Glu Val Leu Thr Lys Lys Asn Gly Arg lie Ala 
100 105 110 

Val Cys Gln Ile Ala Pro Ala Ile Arg Ile Asn Met Ala Glu Ala Leu
115 120 125

Gly Val pro Ala Gly Thr lie Ser Leu Gly Lys val val Thr Ala Leu 
130 135 140 

Lys Arg Leu Gly Phe Asp Tyr Val Phe Asp Thr Asn Phe Ala Ala Asp 
145 150 155 160 

Met Thr lie Val Glu Glu Ala Thr Glu Leu val Gin Arg Leu ser Asp 
165 170 175 

Lys Asn Ala Val Leu Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val 
180 185 190 

Asn Tyr val Glu Lys ser Asp Pro Ser Leu lie Pro Tyr Leu Ser Ser 
195 200 205 

cys Arg Ser Pro Met ser Met Leu Ser ser val lie Lys Asn Val Phe 
210 215 220 

Pro Lys Lys lie Gly Thr Thr Ala Asp Lys lie Tyr Asn val Ala lie 
225 230 235 240 

Met Pro Cys Thr Arg Lys Lys Asp Glu lie Gin Arg ser Gin Phe Thr 
245 250 255 

Met Lys Asp Gly Lys Gin Glu Thr Gly Ala Val Leu Thr Ser Arg Glu 
260 265 270 

Leu Ala Lys Met lie Lys Glu Ala Lys lie Asn Phe Lys Glu Leu Pro 
275 280 285 

Asp Thr pro Cys Asp Asn Phe Tyr ser Glu Ala ser Gly Gly Gly Ala 
290 295 300 

lie Phe cys Ala Thr Gly Gly val Met Glu Ala Ala val Arg ser Ala 
305 310 315 320

Tyr Lys Phe Leu Thr Lys Lys Glu Leu Ala Pro lie Asp Leu Gin Asp 
325 330 335 
Val Arg Gly Val Ala Ser Gly Val Lys Leu Ala Glu Val Asp Ile Ala
340 345 350 

Gly Thr Lys val Lys val Ala val Ala His Gly lie Lys Asn Ala Met 
355 360 365 

Thr Leu lie Lys Lys lie Lys Ser Gly Glu Glu Gin Phe Lys Asp val 
370 375 380 

Lys Phe Val Glu val Met Ala Cys Pro Gly Gly Cys Val Val Gly Gly 
385 390 395 400 

Gly ser Pro Lys Ala Lys Thr Lys Lys Ala val Gin Ala Arg Leu Asn 
405 410 415 

Ala Thr Tyr ser lie Asp Lys ser ser Lys His Arg Thr Ser Gin Asp 
420 425 430 

Asn Pro Gin Leu Leu Gin Leu Tyr Lys Glu ser Phe Glu Gly Lys Phe 
435 440 445 

Gly Gly His val Ala His His Leu Leu His Thr His Tyr Lys Asn Arg 
450 455 460 

Lys val Asn Pro 
465 

<210> 101 
<211> 582 
<212> PRT 

<213> C. acetobutylicum 
<400> 101 

Met Lys Thr lie lie Leu Asn Gly Asn Glu Val His Thr Asp Lys Asp 
1 5 10 15

lie Thr lie Leu Glu Leu Ala Arg Glu Asn Asn val Asp lie Pro Thr 
20 25 30 

Leu Cys Phe Leu Lys Asp Cys Gly Asn Phe Gly Lys Cys Gly Val Cys 
35 40 45 

Met val Glu val Glu Gly Lys Gly Phe Arg Ala Ala Cys Val Ala Lys 
50 55 60 

val Glu Asp Gly Met val lie Asn Thr Glu Ser Asp Glu val Lys Glu 
65 70 75 80 

Arg lie Lys Lys Arg Val ser Met Leu Leu Asp Lys His Glu Phe Lys 
85 90 95 

Cys Gly Gin cys ser Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu Val 
100 105 110 

lie Lys Thr Lys Ala Lys Ala ser Lys Pro Phe Leu Pro Glu Asp Lys 
115 120 125 
Asp Ala Leu Val Asp Asn Arg Ser Lys Ala Ile Val Ile Asp Arg Ser
130 135 140 

Lys Cys Val Leu Cys Gly Arg Cys val Ala Ala Cys Lys Gin His Thr 
145 150 155 160

Ser Thr Cys Ser lie Gin Phe lie Lys Lys Asp Gly Gin Arg Ala Val 
165 170 175 

Gly Thr Val Asp Asp Val Cys Leu Asp Asp Ser Thr Cys Leu Leu Cys 
180 185 190 

Gly Gin cys val lie Ala Cys Pro val Ala Ala Leu Lys Glu Lys ser 
195 200 205 

His lie Glu Lys val Gin Glu Ala Leu Asn Asp Pro Lys Lys His Val 
210 215 220 

lie val Ala Met Ala Pro Ser Val Arg Thr Ala Met Gly Glu Leu Phe 
225 230 235 240 

Lys Met Gly Tyr Gly Lys Asp Val Thr Gly Lys Leu Tyr Thr Ala Leu 
245 250 255 

Arg Met Leu Gly Phe Asp Lys Val Phe Asp lie Asn Phe Gly Ala Asp 
260 265 270 

Met Thr lie Met Glu Glu Ala Thr Glu Leu Leu Gly Arg val Lys Asn 
275 280 285 

Asn Gly Pro Phe Pro Met Phe Thr ser cys Cys Pro Ala Trp val Arg 
290 295 300 

Leu Ala Gin Asn Tyr His Pro Glu Leu Leu Asp Asn Leu Ser ser Ala 
305 310 315 320 

Lys Ser Pro Gin Gin lie Phe Gly Thr Ala ser Lys Thr Tyr Tyr Pro 
325 330 335 

Ser lie Ser Gly lie Ala Pro Glu Asp Val Tyr Thr Val Thr lie Met 
340 345 350 

Pro Cys Asn Asp Lys Lys Tyr Glu Ala Asp lie Pro Phe Met Glu Thr 
355 360 365 

Asn ser Leu Arg Asp lie Asp Ala Ser Leu Thr Thr Arg Glu Leu Ala 
370 375 380 

Lys Met lie Lys Asp Ala Lys lie Lys Phe Ala Asp Leu Glu Asp Gly 
385 390 395 400 

Glu val Asp Pro Ala Met Gly Thr Tyr Ser Gly Ala Gly Ala lie Phe 
405 410 415 

Gly Ala Thr Gly Gly Val Met Glu Ala Ala lie Arg Ser Ala Lys Asp 
420 425 430

Phe Ala Glu Asn Lys Glu Leu Glu Asn Val Asp Tyr Thr Glu Val Arg 
435 440 445 

Gly Phe Lys Gly Ile Lys Glu Ala Glu Val Glu Ile Ala Gly Asn Lys
450 455 460 

Leu Asn val Ala Val lie Asn Gly Ala Ser Asn Phe Phe Glu Phe Met 
465 470 475 480 

Lys ser Gly Lys Met Asn Glu Lys Gin Tyr His Phe lie Glu Val Met 
485 490 495 

Ala cys Pro Gly Gly Cys lie Asn Gly Gly Gly Gin Pro His Val Asn 
500 505 510 

Ala Leu Asp Arg Glu Asn val Asp Tyr Arg Lys Leu Arg Ala ser Val 
515 520 525 

Leu Tyr Asn Gin Asp Lys Asn Val Leu Ser Lys Arg Lys ser His Asp 
530 535 540 

Asn Pro Ala lie lie Lys Met Tyr Asp Ser Tyr Phe Gly Lys Pro Gly 
545 550 555 560 

Glu Gly Leu Ala His Lys Leu Leu His Val Lys Tyr Thr Lys Asp Lys 
565 570 575 

Asn val ser Lys His Glu 
580 

<210> 102 
<211> 574 
<212> PRT 

<213> Clostridium pasteurianum 
<400> 102 

Met Lys Thr lie lie lie Asn Gly val Gin Phe Asn Thr Asp Glu Asp 
1 5 10 15 

Thr Thr lie Leu Lys Phe Ala Arg Asp Asn Asn lie Asp lie Ser Ala 
20 25 30 

Leu cys Phe Leu Asn Asn Cys Asn Asn Asp lie Asn Lys Cys Glu lie 
35 40 45 

Cys Thr val Glu val Glu Gly Thr Gly Leu Val Thr Ala Cys Asp Thr 
50 55 60 

Leu lie Glu Asp Gly Met lie lie Asn Thr Asn ser Asp Ala Val Asn 
65 70 75 80 

Glu Lys lie Lys Ser Arg lie Ser Gin Leu Leu Asp lie His Glu Phe 
85 90 95 

Lys Cys Gly Pro cys Asn Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu 
100 105 110

Val Ile Lys Tyr Lys Ala Arg Ala Ser Lys Pro Phe Leu Pro Lys Asp
115 120 125 

Lys Thr Glu Tyr val Asp Glu Arg ser Lys Ser Leu Thr val Asp Arg 
130 135 140

Thr Lys cys Leu Leu Cys Gly Arg cys val Asn Ala cys Gly Lys Asn 
145 150 155 160

Thr Glu Thr Tyr Ala Met Lys Phe Leu Asn Lys Asn Gly Lys Thr lie 
165 170 175 

lie Gly Ala Glu Asp Glu Lys Cys Phe Asp Asp Thr Asn Cys Leu Leu 
180 185 190 

Cys Gly Gin Cys lie lie Ala Cys Pro val Ala Ala Leu ser Glu Lys 
195 200 205 

Ser His Met Asp Arg Val Lys Asn Ala Leu Asn Ala Pro Glu Lys His 
210 215 220 

val lie val Ala Met Ala Pro ser Val Arg Ala Ser lie Gly Glu Leu 
225 230 235 240 

Phe Asn Met Gly Phe Gly Val Asp Val Thr Gly Lys lie Tyr Thr Ala 
245 250 255 

Leu Arg Gin Leu Gly Phe Asp Lys lie Phe Asp lie Asn Phe Gly Ala 
260 265 270 

Asp Met Thr Ile Met Glu Glu Ala Thr Glu Leu Val Gln Arg Ile Glu
275 280 285 

Asn Asn Gly Pro Phe Pro Met Phe Thr Ser cys Cys Pro Gly Trp val 
290 295 300 

Arg Gin Ala Glu Asn Tyr Tyr Pro Glu Leu Leu Asn Asn Leu Ser ser 
305 310 315 320 

Ala Lys Ser Pro Gin Gin lie Phe Gly Thr Ala Ser Lys Thr Tyr Tyr 
325 330 335 

Pro ser lie ser Gly Leu Asp Pro Lys Asn val Phe Thr Val Thr Val 
340 345 350 

Met Pro Cys Thr ser Lys Lys Phe Glu Ala Asp Arg Pro Gin Met Glu 
355 360 365 

Lys Asp Gly Leu Arg Asp lie Asp Ala Val lie Thr Thr Arg Glu Leu 
370 375 380

Ala Lys Met lie Lys Asp Ala Lys lie Pro Phe Ala Lys Leu Glu Asp 
385 390 395 400 
[residues 401-406 illegible] Met Gly Glu Tyr Ser Gly Ala Gly Ala Ile
405 410 415

Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg ser Ala Lys 
420 425 430 

Asp Phe Ala Glu Asn Ala Glu Leu Glu Asp lie Glu Tyr Lys Gin Val 
435 440 445 

Arg Gly Leu Asn Gly lie Lys Glu Ala Glu val Glu lie Asn Asn Asn 
450 455 460 

Lys Tyr Asn Val Ala Val lie Asn Gly Ala Ser Asn Leu Phe Lys Phe 
465 470 475 480 

Met Lys Ser Gly Met lie Asn Glu Lys Gin Tyr His Phe lie Glu Val 
485 490 495 

Met Ala Cys His Gly Gly Cys Val Asn Gly Gly Gly Gin Pro His Val 
500 505 510 

Asn Pro Lys Asp Leu Glu Lys val Asp lie Lys Lys Val Arg Ala Ser 
515 520 525 

Val Leu Tyr Asn Gin Asp Glu His Leu Ser Lys Arg Lys Ser His Glu 
530 535 540 

Asn Thr Ala Leu val Lys Met Tyr Gin Asn Tyr Phe Gly Lys Pro Gly 
545 550 555 560 

Glu Gly Arg Ala His Glu lie Leu His Phe Lys Tyr Lys Lys 
565 570 

<210> 103 
<211> 421 
<212> PRT 

<213> Desulfovibrio vulgaris 
<400> 103

Met Ser Arg Thr Val Met Glu Arg lie Glu Tyr Glu Met His Thr Pro 
1 5 10 15

Asp Pro Lys Ala Asp Pro Asp Lys Leu His Phe val Gin lie Asp Glu 
20 25 30 

Ala Lys cys lie Gly cys Asp Thr cys ser Gin Tyr cys Pro Thr Ala 
35 40 45 

Ala lie Phe Gly Glu Met Gly Glu Pro His ser lie Pro His lie Glu 
50 55 60 

Ala Cys lie Asn cys Gly Gin Cys Leu Thr His Cys Pro Glu Asn Ala 
65 70 75 80 

lie Tyr Glu Ala Gin ser Trp Val Pro Glu Val Glu Lys Lys Leu Lys 
85 90 95 
[residues 97-103 illegible] Ala Met Pro Ala Pro Ala Val Arg Tyr
100 105 110

Ala Leu Gly Asp Ala Phe Gly Met Pro Val Gly Ser Val Thr Thr Gly 
115 120 125 

Lys Met Leu Ala Ala Leu Gin Lys Leu Gly phe Ala His cys Trp Asp 
130 135 140 

Thr Glu Phe Thr Ala Asp Val Thr lie Trp Glu Glu Gly ser Glu Phe 
145 150 155 160 

Val Glu Arg Leu Thr Lys Lys Ser Asp Met Pro Leu Pro Gin Phe Thr 
165 170 175 

ser cys Cys Pro Gly Trp Gin Lys Tyr Ala Glu Thr Tyr Tyr Pro Glu 
180 185 190 

Leu Leu Pro His Phe Ser Thr Cys Lys Ser Pro Ile Gly Met Asn Gly
195 200 205 

Ala Leu Ala Lys Thr Tyr Gly Ala Glu Arg Met Lys Tyr Asp Pro Lys 
210 215 220 

Gin Val Tyr Thr Val Ser lie Met Pro Cys lie Ala Lys Lys Tyr Glu 
225 230 235 240 

Gly Leu Arg Pro Glu Leu Lys Ser ser Gly Met Arg Asp lie Asp Ala 
245 250 255 

Thr Leu Thr Thr Arg Glu Leu Ala Tyr Met lie Lys Lys Ala Gly lie 
260 265 270

Asp Phe Ala Lys Leu Pro Asp Gly Lys Arg Asp Ser Leu Met Gly Glu 
275 280 285 

ser Thr Gly Gly Ala Thr lie Phe Gly val Thr Gly Gly val Met Glu 
290 295 300 

Ala Ala Leu Arg Phe Ala Tyr Glu Ala val Thr Gly Lys Lys Pro Asp 
305 310 315 320

Ser Trp Asp Phe Lys Ala Val Arg Gly Leu Asp Gly lie Lys Glu Ala 
325 330 335 

Thr val Asn Val Gly Gly Thr Asp Val Lys Val Ala Val Val His Gly 
340 345 350 

Ala Lys Arg Phe Lys Gin Val Cys Asp Asp Val Lys Ala Gly Lys Ser 
355 360 365

Pro Tyr His Phe lie Glu Tyr Met Ala cys Pro Gly Gly Cys Val Cys 
370 375 380 

Gly Gly Gly Gin Pro Val Met Pro Gly Val Leu Glu Ala Met Asp Arg 
385 390 395 400
Thr Thr Thr Arg Leu Tyr Ala Gly Leu Lys Lys Arg Leu Ala Met Ala
405 410 415 

Ser Ala Asn Lys Ala 
420 

<210> 104 
<211> 449 
<212> PRT 

<213> Trichomonas vaginalis 
<400> 104 

Met Leu Ala ser Ser Ser Arg Ala Ala Ala Asn lie Arg Trp Val Asp 
1 5 10 15 

Thr Ser His Asn Ala lie Ala Phe Asp Met His Lys cys lie Asn Cys 
20 25 30 

Gin Ala cys val Arg Ala Cys Lys Asn Val Ala Gly Gin Ser Val Leu 
35 40 45 

Lys ser Val Lys lie Asn Glu Gly Lys Lys Lys Gly val val Gin Thr 
50 55 60 

val Thr Gly Lys Leu Leu Ala Glu Thr Asn cys lie Gly cys Gly Gin 
65 70 75 80 

Cys Thr Leu Val Cys Pro Thr Gln Ala Ile His Glu Lys Asp Ala Leu
85 90 95 

Lys Gin Met Asn Asn lie Phe Lys Asn Lys Gly Asp Arg lie Leu val 
100 105 110 

Cys Gin lie Ala Pro Ala lie Arg lie Asn Met Arg Arg Pro Trp Cys 
115 120 125 

ser ser Arg Asn ser Phe His Arg Gin Ser Arg Tyr Ser Pro Gin Arg 
130 135 140 

Leu Gly Phe Asp Tyr val Phe Asp Thr Asn Phe Gly Ala Asp Leu Thr 
145 150 155 160 

lie Val Glu Glu Ala Thr Glu Leu Leu Gin Arg Leu Asn Asp Pro Lys 
165 170 175 

Ala val Leu Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val Asn Tyr 
180 185 190 

Val Glu Lys Ser Tyr Pro Gin Trp Met Pro His Leu Ser Thr Cys Arg 
195 200 205 

Ser Pro lie Gly Met Leu ser Ala Val lie Lys Asn val Phe Pro Lys 
210 215 220 

His lie Gly val Asp Pro Lys Arg lie Phe ser val Gly lie Met Pro 
225 230 235 240
Cys Thr Ala Lys Lys Asp Glu Ala Ala Arg Glu Gln Leu Met Thr Lys
245 250 255 

Ser Gly Leu His Glu Thr Asp Leu Asp lie Thr Ser Arg Glu Leu Ala 
260 265 270 

Lys Met lie Lys Ala Ala Lys lie Asn Phe Lys Glu Leu Pro Asp Thr 
275 280 285 

Glu Leu Asp ser Pro Tyr Ala Met Ala Thr Gly Gly Gly Ala lie Phe 
290 295 300 

cys Ala Thr Gly Gly Val Met Glu Ala Ala Val Arg Ser Ala Tyr Lys 
305 310 315 320

Phe Ala Thr Gly Lys Glu Leu Ala Pro lie Glu Phe val Gin val Arg 

325 330 335 

Gly Ala Glu Lys Gly lie Lys Val Gly Thr Val Asp lie Asn Gly Arg 
340 345 350 

Glu lie Lys Val Ala Val Ala Gin Gly val Lys Asn Ala Met Ser Leu 
355 360 365 

lie Lys Lys lie Glu Glu Gly Gin Asp Asp Val Lys Gly Val val Phe 
370 375 380 

cys Glu val Met Ala cys Pro Gly Gly cys val Gly Gly Gly Gly ser 
385 390 395 400 

Pro Arg Ala Lys Thr Lys Ala Ala Met Asn Lys Arg Leu Asp Ala Thr 
405 410 415

Tyr Arg lie Asp Arg Ala Ser Lys Tyr Arg Thr Pro Gin Asp Asn Thr 
420 425 430

Gin Leu Gin Asp Leu Tyr Asn Ala Thr Trp Val Val Ser Leu Val Met 
435 440 445 

Asp 



<210> 105 

<211> 645 

<212> PRT 

<213> T. maritima 

<400> 105 

Met Lys lie Tyr Val Asp Gly Arg Glu val lie lie Asn Asp Asn Glu 
1 5 10 15

Arg Asn Leu Leu Glu Ala Leu Lys Asn val Gly lie Glu lie Pro Asn 
20 25 30 

Leu Cys Tyr Leu Ser Glu Ala Ser lie Tyr Gly Ala Cys Arg Met cys 
35 40 45 
Leu Val Glu Ile Asn Gly Gln Ile Thr Thr Ser Cys Thr Leu Lys Pro
50 55 60 

Tyr Glu Gly Met Lys Val Lys Thr Asn Thr Pro Glu Ile Tyr Glu Met
65 70 75 80 

Arg Arg Asn lie Leu Glu Leu lie Leu Ala Thr His Asn Arg Asp cys 
85 90 95 

Thr Thr Cys Asp Arg Asn Gly Ser Cys Lys Leu Gin Lys Tyr Ala Glu 
100 105 110

Asp Phe Gly lie Arg Lys lie Arg Phe Glu Ala Leu Lys Lys Glu His 
115 120 125

val Arg Asp Glu Ser Ala Pro Val val Arg Asp Thr ser Lys Cys lie 
130 135 140 

Leu cys Gly Asp Cys Val Arg Val cys Glu Glu lie Gin Gly val Gly 
145 150 155 160 

val lie Glu Phe Ala Lys Arg Gly Phe Glu Ser Val Val Thr Thr Ala 
165 170 175

Phe Asp Thr Pro Leu Ile Glu Thr Glu Cys Val Leu Cys Gly Gln Cys
180 185 190 

Val Ala Tyr cys Pro Thr Gly Ala Leu ser lie Arg Asn Asp lie Asp 
195 200 205 

Lys Leu lie Glu Ala Leu Glu ser Asp Lys lie Val lie Gly Met lie 
210 215 220 

Ala Pro Ala Val Arg Ala Ala lie Gin Glu Glu Phe Gly lie Asp Glu 
225 230 235 240 

Asp Val Ala Met Ala Glu Lys Leu Val ser Phe Leu Lys Thr lie Gly 
245 250 255 

Phe Asp Lys Val Phe Asp Val Ser Phe Gly Ala Asp Leu Val Ala Tyr 
260 265 270 

Glu Glu Ala His Glu Phe Tyr Glu Arg Leu Lys Lys Gly Glu Arg Leu 
275 280 285 

Pro Gin Phe Thr ser Cys Cys Pro Ala Trp Val Lys His Ala Glu His 
290 295 300 

Thr Tyr Pro Gin Tyr Leu Gin Asn Leu Ser ser Val Lys ser pro Gin 
305 310 315 320 

Gin Ala Leu Gly Thr Val lie Lys Lys lie Tyr Ala Arg Lys Leu Gly 
325 330 335 

Val Pro Glu Glu Lys lie Phe Leu val Ser Phe Met Pro Cys Thr Ala 
340 345 350 
Lys Lys Phe Glu Ala Glu Arg Glu Glu His Glu Gly Ile Val Asp Ile
355 360 365 

Val Leu Thr Thr Arg Glu Leu Ala Gin Leu lie Lys Met Ser Arg lie 
370 375 380 

Asp lie Asn Arg Val Glu Pro Gin Pro Phe Asp Arg Pro Tyr Gly Val 
385 390 395 400 

Ser Ser Gin Ala Gly Leu Gly Phe Gly Lys Ala Gly Gly Val Phe Ser 
405 410 415 

Cys val Leu Ser Val Leu Asn Glu Glu lie Gly lie Glu Lys val Asp 
420 425 430 

Val Lys ser Pro Glu Asp Gly lie Arg Val Ala Glu val Thr Leu Lys 
435 440 445

Asp Gly Thr ser Phe Lys Gly Ala val lie Tyr Gly Leu Gly Lys val 
450 455 460 

Lys Lys Phe Leu Glu Glu Arg Lys Asp val Glu lie lie Glu val Met 
465 470 475 480

Ala cys Asn Tyr Gly cys Val Gly Gly Gly Gly Gin Pro Tyr Pro Asn 
485 490 495 

Asp ser Arg lie Arg Glu His Arg Ala Lys Val Leu Arg Asp Thr Met 
500 505 510 

Gly lie Lys Ser Leu Leu Thr Pro Val Glu Asn Leu Phe Leu Met Lys 
515 520 525 

Leu Tyr Glu Glu Asp Leu Lys Asp Glu His Thr Arg His Glu lie Leu 
530 535 540 

His Thr Thr Tyr Arg Pro Arg Arg Arg Tyr Pro Glu Lys Asp Val Glu 
545 550 555 560

lie Leu Pro Val Pro Asn Gly Glu Lys Arg Thr val Lys Val cys Leu 
565 570 575 

Gly Thr Ser Cys Tyr Thr Lys Gly Ser Tyr Glu lie Leu Lys Lys Leu 
580 585 590 

val Asp Tyr val Lys Glu Asn Asp Met Glu Gly Lys lie Glu val Leu 
595 600 605 

Gly Thr Phe Cys val Glu Asn cys Gly Ala ser Pro Asn val lie val 
610 615 620 

Asp Asp Lys lie lie Gly Gly Ala Thr Phe Glu Lys val Leu Glu Glu 
625 630 635 640 

Leu ser Lys Asn Gly 
645

<210> 106

<211> 369 

<212> PRT 

<213> T. vaginalis

<400> 106 

cys Asp Gly Lys Trp Leu Ala Pro Ala cys Val Thr Thr val Trp Asp 
1 5 10 15 

Gly Leu Lys lie Asp Thr Lys Ser Lys Met Val Lys Glu ser Val Glu 
20 25 30 

Asn Asn Leu Lys Glu Leu Leu Asp cys His Asp Glu Thr cys ser ser 
35 40 45 

Cys val Ala Asn His Arg cys Gin phe Arg Asp Met Asn Val Ala Tyr 
50 55 60 

Ser lie Lys Ala Glu Thr Lys Glu Glu cys Ser Glu Glu Gly lie Asp 
65 70 75 80 

Glu Ser Thr Asn Ser Ile Arg Leu Asp Thr Ser Lys Cys Val Leu Cys
85 90 95 

Gly Arg cys lie Arg Ala Cys Glu Glu val Ala Gly Gin ser Ala lie 
100 105 110 

lie Phe Gly Asn Arg Ala Lys His Met Arg lie Gin Pro Thr Phe Gly 
115 120 125 

Gin Thr Leu Gin Asp Thr Ser cys lie Lys Cys Gly Gin cys Thr Leu 
130 135 140 

Tyr cys Pro Val Gly Ala lie Thr Glu Lys Ser Gin val Lys Gin Ala 
145 150 155 160 

Leu Asp lie Leu Ser Asn Lys Gly Lys Lys lie Ser Val lie Gin Val 
165 170 175 

Ala Pro Ala val Arg Val Ala Leu ser Glu Ala Phe Gly Tyr Lys Glu 
180 185 190

Gly ser val Thr Thr Gly Lys Met val Ser Ala Leu Lys Ala Leu Gly 
195 200 205 

Phe Asp Tyr Val Tyr Asp Thr Asn Tyr Ser Ala Asp Leu Thr lie val 
210 215 220 

Glu Glu Ala Gly Glu Leu val Gin Arg Leu Lys Asn Pro Asn Ala Val 
225 230 235 240 

Phe Pro Met Phe Thr Ser Cys Cys Pro Ala Trp Val Asn Tyr Val Glu 
245 250 255

Gin ser Ala Pro Asp Phe lie Pro Asn Leu ser ser cys Arg ser Pro 
260 265 270

Gin Gly Met Leu ser Ser Leu val Lys Asn Tyr Leu Pro Lys val Leu 
275 280 285 

Asn lie Pro Val Glu Asp Val Leu Asn Phe ser lie Met Pro Cys Thr 
290 295 300 

Ala Lys Lys Asp Glu lie Glu Arg Pro Glu Leu Arg Thr Lys Asp Gly 
305 310 315 320 

His Lys Glu Thr Asp Met val Leu Thr Val Arg Glu Leu Val Glu Met 
325 330 335 

lie Lys Leu ser Gly lie Asp Phe Asn Asn Leu Pro Asp Thr Pro Phe 
340 345 350 

Asp Ser lie Phe Gly Phe Gly ser Gly Ala Gly Gin lie Phe Ala Ala 
355 360 365 

Thr 



<210> 107 
<211> 476 
<212> PRT 

<213> R. norvegicus 
<400> 107 

Met Ala Ser Pro Phe Ser Gly Ala Leu Gin Leu Thr Asp Leu Asp Asp 
1 5 10 15

Phe lie Gly Pro Ser Gin Ser Cys lie Lys Pro Val Thr Val Ala Lys 
20 25 30 

Lys Pro Gly ser Gly lie Ala Lys lie His lie Glu Asp Asp Gly ser 
35 40 45 

Tyr Phe Gin val Asn Pro Asp Gly Arg ser Gin Lys Leu Glu Lys Ala 
50 55 60 

Lys Val ser Leu Asn Asp Cys Leu Ala Cys ser Gly cys Val Thr ser 
65 70 75 80 

Ala Glu Thr lie Leu lie Thr Gin Gin Ser His Glu Glu Leu Arg Lys 
85 90 95 

Val Leu Asp Ala Asn Lys Val Ala Ala Pro Gly Gin Gin Arg Leu Val 
100 105 110

val val Ser Val ser Pro Gin ser Arg Ala Ser Leu Ala Ala Arg Phe 
115 120 125 

Gin Leu Asp Ser Thr Asp Thr Ala Arg Lys Leu Thr ser Phe Phe Lys 
130 135 140

Lys lie Gly Val His Phe Val Phe Asp Thr Ala Phe Ala Arg Asn Phe 
145 150 155 160

ser Leu Leu Glu ser Gin Lys Glu Phe Val Gin Arg Phe Arg Glu Gin 
165 170 175

Ala Asn ser Arg Glu Ala Leu Pro Met Leu Ala ser Ala cys Pro Gly 
180 185 190 

Trp lie cys Tyr Ala Glu Lys Thr His Gly Asn Phe lie Leu Pro Tyr 
195 200 205 

lie Ser Thr Ala Arg ser Pro Gin Gin Val Met Gly Ser Leu lie Lys 
210 215 220 

Asp Phe Phe Ala Gin Gin Gin Leu Leu Thr pro Asp Lys lie Tyr His 
225 230 235 240 

val Thr val Met Pro Cys Tyr Asp Lys Lys Leu Glu Ala ser Arg Pro 
245 250 255 

Asp Phe Phe Asn Gin Glu Tyr Gin Thr Arg Asp Val Asp cys val Leu 
260 265 270 

Thr Thr Gly Glu val Phe Arg Leu Leu Glu Glu Glu Gly val ser Leu 
275 280 285 

Ser Glu Leu Glu Pro val Pro Leu Asp Gly Leu Thr Arg Ser val ser 
290 295 300 

Ala Glu Glu Pro Thr ser His Arg Gly Gly Gly ser Gly Gly Tyr Leu 
305 310 315 320 

Glu His val Phe Arg His Ala Ala Gin Glu Leu Phe Gly lie His val 
325 330 335 

Ala Asp Val Thr Tyr Gin Pro Met Arg Asn Lys Asp Phe Gin Glu val 
340 345 350 

Thr Leu Glu Arg Glu Gly Gin Val Leu Leu Arg Phe Ala Val Ala Tyr 
355 360 365 

Gly Phe Arg Asn lie Gin Asn Leu Val Gin Lys Leu Lys Arg Gly Arg 
370 375 380 

Cys Pro Tyr His Tyr Val Glu Val Met Ala cys Pro ser Gly cys Leu 
385 390 395 400 

Asn Gly Gly Gly Gin Leu Lys Ala Pro Asp Thr Glu Gly Arg Glu Leu 
405 410 415 

Leu Gin Gin Val Glu Arg Leu Tyr ser Met Val Arg Thr Glu Ala Pro 
420 425 430

Glu Asp Ala Pro Gly Val Gin Glu Leu Tyr Gin His Trp Leu Gin Gly 
435 440 445 
[residues 449-455 illegible] His Leu Leu His Thr Gln Tyr His Ala
450 455 460 

Val Glu Lys lie Asn Ser Gly Leu ser lie Arg Trp 
465 470 475 

<210> 108 

<211> 525 

<212> PRT 

<213> S. cerevisiae

<400> 108 

Met Ala ser Pro Phe Ser Gly Ala Leu Gin Leu Thr Asp Leu Asp Asp 
1 5 10 15 

Phe lie Gly Pro Ser Gin Val Gly ser Leu Gin Ala Leu Leu Ala Leu 
20 25 30 

Ala Phe Leu His Thr Gly Asn Phe Ser Ala Ala Gly Cys Trp Glu Pro 
35 40 45 

Asp Pro Trp Glu Cys lie Lys Pro Val Lys val Glu Lys Arg Ala Gly 
50 55 60 

Ser Gly Val Ala Lys lie Arg lie Glu Asp Asp Gly Ser Tyr Phe Gin 
65 70 75 80

lie Asn Gin Glu Lys Leu Gly Glu Leu Glu Leu Glu Pro Thr Phe Gly 
85 90 95 

lie Phe Leu Pro Tyr Ser Pro Asp Gly Gly Thr Arg Arg Leu Glu Lys 
100 105 110 

Ala Lys val ser Leu Asn Asp Cys Leu Ala cys Ser Gly Cys lie Thr 
115 120 125 

Ser Ala Glu Thr Val Leu lie Thr Gin Gin Ser His Glu Glu Leu Lys 
130 135 140 

Lys Val Leu Asp Ala Asn Lys Met Ala Ala Pro Ser Gin Gin Arg Leu 
145 150 155 160 

Val val val ser val ser Pro Gin ser Arg Ala ser Leu Ala Ala Arg 
165 170 175 

Phe Gin Leu Asn Pro Thr Asp Thr Ala Arg Lys Leu Thr ser Phe Phe 
180 185 190

Lys Lys lie Gly Val His Phe Val Phe Asp Thr Ala Phe Ser Arg His 
195 200 205 

Phe Ser Leu Leu Glu Ser Gin Arg Glu Phe Val Arg Arg Phe Arg Gly 
210 215 220 

Gin Ala Asp Cys Arg Gin Ala Leu Pro Leu Leu Ala ser Ala cys Pro 
225 230 235 240 
[residues 241-246 illegible] Glu Lys Thr His Gly Ser Phe Ile Leu Pro
245 250 255



His lie ser Thr Ala Arg ser Pro Gin Gin val Met Gly Ser Leu val 
260 265 270 

Lys Asp Phe Phe Ala Gin Gin Gin His Leu Thr Pro Asp Lys lie Tyr 
275 280 285 

His val Thr val Met Pro cys Tyr Asp Lys Lys Leu Glu Ala Ser Arg 
290 295 300 

Pro Asp Phe Phe Asn Gin Glu His Gin Thr Arg Asp Val Asp Cys val 
305 310 315 320 

Leu Thr Thr Gly Glu val Phe Arg Leu Leu Glu Glu Glu Gly Val Ser 
325 330 335 

Leu Pro Asp Leu Glu Pro Ala Pro Leu Asp Ser Leu Cys Ser Gly Ala 
340 345 350 

Ser Ala Glu Glu Pro Thr ser His Arg Gly Gly Gly ser Gly Gly Tyr 
355 360 365

Leu Glu His Val Phe Arg His Ala Ala Arg Glu Leu Phe Gly lie His 
370 375 380 

val Ala Glu val Thr Tyr Lys Pro Leu Arg Asn Lys Asp Phe Gin Glu 
385 390 395 400 

val Thr Leu Glu Lys Glu Gly Gin Val Leu Leu His Phe Ala Met Ala 
405 410 415 

Tyr Gly Phe Arg Asn lie Gin Asn Leu Val Gin Arg Leu Lys Arg Gly 
420 425 430 

Arg Cys Pro Tyr His Tyr Val Glu Val Met Ala Cys Pro Ser Gly Cys 
435 440 445 

Leu Asn Gly Gly Gly Gin Leu Gin Ala Pro Asp Arg Pro ser Arg Glu 
450 455 460 

Leu Leu Gin His Val Glu Arg Leu Tyr Gly Met Val Arg Ala Glu Ala 
465 470 475 480 

Pro Glu Asp Ala Pro Gly val Gin Glu Leu Tyr Thr His Trp Leu Gin 
485 490 495 

Gly Thr Asp Ser Glu cys Ala Gly Arg Leu Leu His Thr Gin Tyr His 
500 505 510 

Ala val Glu Lys Ala Ser Thr Gly Leu Gly lie Arg Trp 
515 520 525 



<210> 109 
<211> 572 
<212> PRT 
<400> 109

Met Asn Lys Ile Ile Ile Asn Asp Lys Thr Ile Glu Phe Asp Gly Asp
1 5 10 15

Lys Thr lie Leu Asp Leu Ala Arg Glu Asn Gly Phe Asp lie Pro val 
20 25 30 

Leu cys Glu Leu Lys Asn cys Gly Asn Lys Gly Gin Cys Gly val Cys 
35 40 45 

Leu val Glu Gin Glu Gly Asn Asp Arg Leu Leu Arg ser cys Ala lie 
50 55 60 

Lys Ala Lys Asp Gly Met Val lie Lys Thr Asp ser Glu Lys val Leu 
65 70 75 80 

Glu Ala Arg Lys Glu Arg Val Ala Glu Leu Leu Asp Glu His Glu Phe 
85 90 95 

Lys cys Gly Pro cys Lys Arg Arg Glu Asn Cys Glu Phe Leu Lys Leu 
100 105 110 

Val lie Lys Thr Lys Ala Arg Ala His Lys Pro Phe val val Ala Asp 
115 120 125 

Lys Ser Glu Tyr Val Asp Asp Arg ser Lys ser lie Val Leu Asp Arg 
130 135 140 

Ser Lys Cys Val Lys Cys Gly Arg Cys Val Ala Ala cys Arg Thr Arg 
145 150 155 160 

Thr Ala Thr Asn ser lie Lys Phe His Arg lie Asp Gly val Arg Leu 
165 170 175 

Val Gly Pro Glu Glu Leu Lys Cys Phe Asp Asp Thr Asn Cys Leu Leu 
180 185 190 

Cys Gly Gin Cys lie Ala Ala Cys Pro Val Asp Ala Leu ser Glu Lys 
195 200 205 

Ser His lie Glu Arg Val Gin Asp Ala Leu Asn Asp Pro Glu Lys His 
210 215 220 

Val lie Val Ala Met Ala Pro Ala val Arg Thr ser Met Gly Glu Leu 
225 230 235 240 

Phe Lys Met Gly Tyr Gly Gin Asp Val Thr Gly Lys Leu Tyr Thr Ala 
245 250 255 

Leu Arg Glu Leu Gly Phe Asp Lys Val Phe Asp lie Asn Phe Gly Ala 
260 265 270 

Asp Met Thr lie Met Glu Glu Ala Thr Glu Leu lie Glu Arg lie Lys 
275 280 285 
Asn Asn Gly Pro Phe Pro Met Leu Thr Ser Cys Cys Pro Ser Trp Val
290 295 300 



Arg Glu Val Glu Asn Tyr Phe Pro Glu Leu Val Glu Asn Leu Ser Ser
305 310 315 320 



Ala Lys ser Pro Gin Gin lie Phe Gly Ala Ala Ser Lys Thr Tyr Tyr 
325 330 335 



Pro Gin Val Ala Asp lie Asp pro Lys Lys Val Phe Thr Val Thr Val 
340 345 350 



Met Pro cys Thr ser Lys Lys Phe Glu Ala Asp Arg Pro Glu Met Glu 
355 360 365 



Asn Glu Gly lie Arg Asn lie Asp Ala Val lie Thr Thr Arg Glu Leu 
370 375 380



Ala Arg Met lie Lys Ala Ala Lys lie Asp Phe Ala Lys Leu Glu Asp 
385 390 395 400



Gly Glu val Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly val lie 
405 410 415



Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala Leu Arg Thr Ala Lys 
420 425 430 



Asp Phe Met Glu Asn Asp Asn Leu Asp Asn val Asp Tyr Glu Ala val 
435 440 445 



Arg Gly Leu Ala Gly lie Lys Glu Ala Glu Val Glu lie Ala Gly Asn 
450 455 460 



Glu Tyr Lys Leu Ala val Val Ser Gly Ala Ala Asn val Phe Glu Leu 
465 470 475 480 



Val Lys Ser Gly Lys lie Asn Asp Tyr His Phe lie Glu val Met Ala 
485 490 495 



Cys Pro Gly Gly Cys Val Asn Gly Gly Gly Gin Pro His lie ser Ala 
500 505 510 



Glu Asp ser Asp Lys lie Asp lie Arg Glu val Arg Ala ser val Leu 
515 520 525 



Tyr Asn Gin Asp Lys Asn Leu Glu Lys Arg Lys Ser His Gin Asn Ser 
530 535 540 



Ala Leu Leu Lys Met Tyr Glu Asn Tyr Met Gly Lys Pro Gly His Gly 
545 550 555 560 



Arg Ala His Glu Leu Leu His Met Lys Tyr Lys Lys
565 570



<210> 110 
<211> 572 
<212> PRT
<213> C. perfringens
<400> 110 

Met Asn Lys lie lie lie Asn Asp Lys Thr lie Glu Phe Asp Gly Asp 
1 5 10 15

Lys Thr lie Leu Asp Leu Ala Arg Glu Asn Gly Phe Asp lie Pro Val 
20 25 30 

Leu Cys Glu Leu Lys Asn cys Gly Asn Lys Gly Gin Cys Gly Val cys 
35 40 45 

Leu Val Glu Gin Glu Gly Asn Asp Arg Leu Leu Arg ser cys Ala lie 
50 55 60 

Lys Ala Lys Asp Gly Met Val lie Lys Thr Asp Ser Glu Lys Val Leu 
65 70 75 80 

Glu Ala Arg Lys Glu Arg val Ala Glu Leu Leu Asp Glu His Glu Phe 
85 90 95 

Lys cys Gly Pro Cys Lys Arg Arg Glu Asn cys Glu Phe Leu Lys Leu 
100 105 110 

Val lie Lys Thr Lys Ala Arg Ala His Lys Pro Phe Val val Ala Asp 
115 120 125 

Lys ser Glu Tyr val Asp Asp Arg Ser Lys Ser lie Val Leu Asp Arg 
130 135 140 

ser Lys cys val Lys Cys Gly Arg Cys val Ala Ala cys Arg Thr Arg 
145 150 155 160 

Thr Ala Thr Asn ser lie Lys Phe His Arg lie Asp Gly Val Arg Leu 
165 170 175 

val Gly Pro Glu Glu Leu Lys Cys Phe Asp Asp Thr Asn Cys Leu Leu 
180 185 190 

Cys Gly Gin Cys lie Ala Ala Cys Pro Val Asp Ala Leu ser Glu Lys 
195 200 205 

ser His lie Glu Arg Val Gin Glu Ala Leu Asn Asp Pro Glu Lys His 
210 215 220 

val lie val Ala Met Ala Pro Ala Val Arg Thr Ser Met Gly Glu Leu 
225 230 235 240 

Phe Lys Met Gly Tyr Gly Gin Asp val Thr Gly Lys Leu Tyr Thr Ala 
245 250 255 

Leu Arg Glu Leu Gly Phe Asp Lys Val Phe Asp lie Asn Phe Gly Ala 
260 265 270 

Asp Met Thr lie Met Glu Glu Ala Thr Glu Leu lie Glu Arg lie Lys 
275 280 285 
Asn Asn Gly Pro Phe Pro Met Leu Thr Ser Cys Cys Pro Ser Trp Val
290 295 300 

Arg Glu Val Glu Asn Tyr Phe Pro Glu Leu Val Glu Asn Leu Ser Ser 
305 310 315 320 

Ala Lys Ser pro Gin Gin lie Phe Gly Ala Ala Ser Lys Thr Tyr Tyr 
325 330 335 

Pro Gin val Ala Asp lie Asp Pro Lys Lys Val Phe Thr Val Thr Val 
340 345 350 

Met Pro Cys Thr Ser Lys Lys Phe Glu Ala Asp Arg Pro Glu Met Glu 
355 360 365 

Asn Glu Gly lie Arg Asn lie Asp Ala Val lie Thr Thr Arg Glu Leu 
370 375 380 

Ala Arg Met lie Lys Ala Ala Lys lie Asp Phe Ala Lys Leu Glu Asp 
385 390 395 400 

Gly Glu val Asp Pro Ala Met Gly Glu Tyr Thr Gly Ala Gly val lie 
405 410 415 

Phe Gly Ala Thr Gly Gly val Met Glu Ala Ala Leu Arg Thr Ala Lys 
420 425 430 

Asp Phe Met Glu Asn Asp Asn Leu Asp Asn val Asp Tyr Glu Ala Val 
435 440 445 

Arg Gly Leu Ala Gly lie Lys Glu Ala Glu val Glu lie Ala Gly Asn 
450 455 460 

Glu Tyr Lys Leu Ala Val val Ser Gly Ala Ala Asn val Phe Glu Leu 
465 470 475 480 

val Lys Ser Gly Lys lie Asn Asp Tyr His Phe lie Glu Val Met Ala 
485 490 495 

cys Pro Gly Gly cys val Asn Gly Gly Gly Gin Pro His lie ser Ala 
500 505 510 

Glu Asp Ser Asp Lys Met Asp lie Arg Glu val Arg Ala Ser Val Leu 
515 520 525 

Tyr Asn Gin Asp Lys Asn Leu Glu Lys Arg Lys ser His Gin Asn ser 
530 535 540 

Ala Leu Leu Lys Met Tyr Glu Ser Tyr Met Gly Lys Pro Gly His Gly 
545 550 555 560 

Arg Ala His Glu Leu Leu His Met Lys Tyr Lys Lys 
565 570 

<210> 111 
<212> PRT
<213> C. tetani

<400> 111 

Met lie val Phe Glu Asn Gin Leu Lys Lys Leu Lys Tyr Leu val Leu 
1 5 10 15

Lys Glu val Ala Lys Met Thr Leu Glu Asp Arg Leu Gly Glu Glu Asp 
20 25 30

lie Glu Arg lie Ser Phe Asp lie lie Lys Gly Asp Lys Ala Glu Tyr 
35 40 45 

Arg cys Cys Val Tyr Lys Glu Arg Ala lie Val Tyr Glu Arg Ala Lys 
50 55 60 

Leu Ala Thr Gly Cys Leu Pro Asn Gly Gin Val Ala Glu Glu Phe Val 
65 70 75 80 

His Val Glu Asp Asp Asp Gin lie lie Tyr Val lie Asp Ala Ala cys 
85 90 95 

Asp Lys cys Pro lie Asn Lys Tyr val Val Thr Glu Ala Cys Arg Gly 
100 105 110 

Cys Leu Gin His Lys Cys Met Glu val Cys Pro Ala Gly Ser lie Asn 
115 120 125 

Arg Ala Ala Gly Lys Ala Tyr lie Asn His Glu Thr Cys Lys Glu Cys 
130 135 140 

Gly Leu Cys Glu Ser Ala cys Pro Tyr Asn Ala lie Ala Glu val Met 
145 150 155 160 

Arg Pro cys Arg Arg Ala cys Pro Thr Gly Ala Leu Gin Met Asn Leu 
165 170 175 

Glu Asp Asn Lys Ala Thr lie Asn Lys Glu Asp cys lie Asn Cys Gly 
180 185 190 

Ser cys Met ser val cys Pro Phe Gly Ala lie ser Asp Lys Ser Tyr 
195 200 205 

lie Val Asp lie Thr Lys Ala Leu Lys Asn Asn Lys Lys Val Tyr Ala 
210 215 220 

Met Val Ala Pro Ala lie Thr Gly Gin Phe Gly Lys Asp Val ser Val 
225 230 235 240 

Gly Lys Met Lys Asn Ala Phe Lys Ala Met Gly Phe Glu Asp Met Leu 
245 250 255 

Glu val Ala cys Gly Ala Asp Ala val Ala Ala His Glu ser Glu Glu 
260 265 270 

Phe Ile Glu Arg Leu Glu Ser Gly Lys Lys Tyr Met Thr Thr Ser Cys 
275 280 285 



cys Pro Gly Phe Leu Gly Tyr lie Glu Lys Lys Phe Pro Asp Gin Leu 
290 295 300 



Glu Asn Val Ser Asn Thr val ser Pro Met Val Ala lie Gly Arg Met 
305 310 315 320 



lie Lys Lys Glu Tyr Glu Asp ser Val Val val Phe val Gly Pro cys 
325 330 335 



Thr Ala Lys Lys Ala Glu lie Lys Arg Lys Gly lie Lys Asp Ala val 
340 345 350 



Asp Tyr val Met Thr Phe Glu Glu lie Ala Ala Leu Met Gly Ala Phe 
355 360 365 



Glu lie Asp Pro Ala Glu Cys Glu Glu Glu Asp lie Asn Asp Gly Ser 
370 375 380 



Asn Tyr Gly Arg Gly Phe Ala Gin Gly Gly Gly val Val Ser Ala lie 
385 390 395 400 



Gin Asn cys lie Lys Asp Lys Glu Gly lie Lys Phe Asn Pro Leu Arg 
405 410 415 



val ser Gly Pro Asp Gin lie Lys Arg Ala Met lie Met Ala Lys val 
420 425 430 



Gly Lys Leu ser Glu Asn Phe lie Glu Gly Met Met Cys Glu Gly Gly 
435 440 445 



Cys lie Gly Gly Pro Ala Thr Met Val Ser Ala Val Lys Ala Lys Ala 
450 455 460 



Pro Leu Met Lys Phe Ser Lys Ser Ser Thr lie Lys Asp Val Lys Asp 
465 470 475 480 



Asn Glu val Leu Asp Lys Tyr Lys Asp lie Asn Met Glu Arg 
485 490 



<210> 112 

<211> 448 

<212> PRT 

<213> C. tetani 

<400> 112 

Met His Asn Asp Tyr Arg Glu lie Phe Lys Arg Leu ser Lys ser Tyr 
1 5 10 15 



Tyr Asp Asp Thr Phe Glu Lys Glu Val Glu Asn lie Leu ser Ser His 
20 25 30 



Ser Met Asp Arg Glu Lys Leu Ala Lys lie lie Ser lie Leu Cys Gly 
35 40 45 



Val Asn lie Glu His Ser Glu Asn Tyr lie Ser Asn Leu Lys Asn Ala 



50 55 60 

lie Lys Asn Tyr Thr Ala Ser Ala Glu Lys val val Thr Lys Leu Pro 
65 70 75 80 

cys ser Thr Gin cys Ala Lys Asp Gly Asp lie lie cys Glu Lys ser 
85 90 95 

cys Pro val Asn Ala lie Phe Arg Asp Pro Asn Asp Asn Asn lie Tyr 
100 105 110 

lie Asn Asp Glu Leu cys Leu Asp cys Gly Leu Cys Val Arg Asn Cys 
115 120 125 

Pro Ser Gly Ser lie Leu Asp Lys Lys Glu Phe lie Pro Leu Ala Glu 
130 135 140 

Leu Leu Lys Ser Glu Ser lie Val lie Ala Ala Val Ala Pro Ala lie 
145 150 155 160 

Met Gly Gin Phe Gly Glu Asn Thr Thr lie Asn Gin Leu Arg Thr Ala 
165 170 175 

Phe Lys Lys Leu Gly Phe Thr Asp Met Val Glu Val Ala Phe Phe Ala 
180 185 190 

Asp Met Leu Thr Leu Lys Glu Ala val Glu Tyr Asp His Phe Val Lys 
195 200 205 

Asp Glu Gin Asp Phe Met lie Thr Ser Cys Cys cys Pro Met Trp val 
210 215 220 

Gly Met Leu Lys Lys Val Tyr Asn Asp Leu val Lys Tyr Val Ser Pro 
225 230 235 240 

Ser Val ser Pro Met lie Ala Ala Gly Arg Val Leu Lys Leu Leu Asn 
245 250 255 

Pro Asn Cys Lys Val Val Phe val Gly Pro Cys lie Ala Lys Lys Ala 
260 265 270 

Glu Ala Arg Glu Lys Asp Leu Leu Gly Asp lie Asp Phe val Leu Thr 
275 280 285 

Phe Thr Glu Leu Arg Asp lie Phe Asp Val Phe Asp lie Gin Pro Glu 
290 ~ 295 300 

Asn Leu Glu Glu Asp Phe Ser ser Glu Tyr Ala Ser Lys Gly Gly Arg 
305 310 315 320 

Leu Tyr Ala Arg Thr Gly Gly val ser He Ala Val Ser Glu Ala lie 
325 330 335 

Glu Lys Leu Phe Pro Asn Lys Tyr Lys Phe Leu Lys Thr lie Gin Ala 
340 345 350 
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J^^JrSrUttySsUl^W-rfys ser Leu Leu Asp Lys He Lys Gin Glu 
355 360 365 

Asp lie Ser Ala Asn Phe Val Glu Gly Met Gly cys val Gly Gly cys 
370 375 380 

val Gly Gly Pro Lys Val lie lie Asp Pro ser Glu Gly Arg Asn Ala 
385 - 390 395 400 

Val Asn Asn Phe Ala Glu Asn ser ser lie Lys val ser val Asp Ser 
405 410 415 

Asn Cys Met Asn Asp lie Leu Ser Lys lie Asn lie Asn Ser Val Glu 
420 425 430 

Asp Phe Lys Asp Lys Asp Lys lie Ser lie Phe Glu Arg Glu Phe Lys 
435 440 445 

<210> 113 

<211> 261 

<212> PRT 

<213> Pyrococcus furiosus 

<400> 113 

Met Gly Lys val Arg lie Gly Phe Tyr Ala Leu Thr ser Cys Tyr Gly 
1 5 10 15 

Cys Gin Leu Gin Leu Ala Met Met Asp Glu Leu Leu Gin Leu lie Pro 
20 25 30 

Asn Ala Glu lie Val Cys Trp Phe Met lie Asp Arg Asp Ser lie Glu 
35 40 45 

Asp Glu Lys val Asp lie Ala Phe lie Glu Gly Ser Val Ser Thr Glu 
50 55 60 

Glu Glu Val Glu Leu Val Lys Lys lie Arg Glu Asn Ala Lys lie Val 
65 70 75 80 

val Ala Val Gly Ala Cys Ala Val Gin Gly Gly Val Gin Ser Trp Ser 
85 90 95 

Glu Lys Pro Leu Glu Glu Leu Trp Lys Lys val Tyr Gly Asp Ala Lys 
100 105 110 

val Lys Phe Gin Pro Lys Lys Ala Glu Pro val ser Lys Tyr lie Lys 
115 120 125 

val Asp Tyr Asn lie Tyr Gly Cys Pro Pro Glu Lys Lys Asp Phe Leu 
130 135 140 

Tyr Ala Leu Gly Thr Phe Leu lie Gly Ser Trp Pro Glu Asp lie Asp 
145 150 155 160 

Tyr Pro Val Cys Leu Glu cys Arg Leu Asn Gly His Pro cys lie Leu 
165 ~ 170 175 
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185 



ng 
Ala 



190 



Gly 



cys 



Asn Ala Arg Cys Pro Gly Phe Gly val Ala cys lie Gly cys Arg Gly 
195 " 200 205 



Ala lie Gly Tyr Asp val Ala Trp Phe Asp ser Leu Ala Lys val Phe 
210 215 220 



Lys Glu Lys Gly Met Thr Lys Glu Glu lie lie Glu Arg Met Lys Met 
225 ' 230 235 240 



Phe Asn Gly His Asp Glu Arg Val Glu Lys Met Val Glu Lys lie Phe 
245 " 250 255 



Ser Gly Gly Glu Gin 



<210> 114 
<211> 252 
<212> PRT 

<213> Escherichia coli 
<400> 114 

Met Ser Pro Val Leu Thr Gln His Val Ser Gln Pro Ile Thr Leu Asp 
1 5 10 15 



Glu Gin Thr Gin Lys Met Lys Arg His Leu Leu Gin Asp lie Arg Arg 
20 25 30 



ser Ala Tyr val Tyr Arg val Asp cys Gly Gly cys Asn Ala cys Glu 
35 40 45 



lie Glu lie Phe Ala Ala lie Thr Pro Val Phe Asp Ala Glu Arg Phe 
50 55 60 



Gly lie Lys Val Val Ser Ser Pro Arg His Ala Asp lie Leu Leu Phe 
65 70 75 80 



Thr Gly Ala val Thr Arg Ala Met Arg Met Pro Ala Leu Arg Ala Tyr 
85 ~ 90 95 



Glu ser Ala Pro Asp His Lys lie Cys Val Ser Tyr Gly Ala cys Gly 
100 105 110 



Val Gly Gly Gly lie Phe His Asp Leu Tyr Ser Val Trp Gly Gly ser 
115 120 125 



Asp Thr lie Val Pro lie Asp Val Trp lie Pro Gly Cys Pro Pro Thr 
130 135 140 



Pro Ala Ala Thr lie His Gly Phe Ala Val Ala Leu Gly Leu Leu Gin 
145 150 155 160 



Gln Lys Ile His Ala Val Asp Tyr Arg Asp Pro Thr Gly Val Thr Met 
165 170 175 

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' mW,PU^iM herb UA Wtf e Pro Pro Ser Gin Arg lie Ala lie Glu 
180 185 " 190 

Arg Glu Ala Arg Arg Leu Ala Gly Tyr Arg Gin Gly Arg Glu lie Cys 
195 " 200 205 

Asp Arg Leu Leu Arg His Leu ser Asp Asp Pro Thr Gly Asn Arg Val 
210 215 220 

Asn Thr Trp Leu Arg Asp Ala Asp Asp Pro Arg Leu Asn ser lie Val 
225 230 235 240 

Gin Gin Leu Phe Arg Val Leu Arg Gly Leu His Asp 
245 250 

<210> 115 
<211> 236 
<212> PRT 

<213> Methanothermobacter thermautotrophicus 
<400> 115 

Met Ala Glu Glu Asn Ala Lys Pro Arg Ile Gly Tyr Ile His Leu Ser 
1 5 10 15 

Gly Cys Thr Gly Asp Ala Met Ser Leu Thr Glu Asn Tyr Asp lie Leu 
20 25 30 

Ala Glu Leu Leu Thr Asn Met Val Asp lie Val Tyr Gly Gin Thr Leu 
35 40 45 

val Asp Leu Trp Glu Met Pro Glu Met Asp Leu Ala Leu Val Glu Gly 
50 55 60 

Ser Val Cys Leu Gin Asp Glu His Ser Leu His Glu Leu Lys Glu Leu 
65 70 75 80 

Arg Glu Lys Ala Lys Leu Val cys Ala Phe Gly Ser cys Ala Gin Thr 
85 90 95 

Gly Cys Phe Thr Arg Tyr Ser Arg Gly Gly Gin Gin Ala Gin Pro Ser 
100 105 110 

His Glu Ser Phe Val Pro lie Ala Asp Leu lie Asp Val Asp Leu Ala 
115 120 125 

lie Pro Gly cys Pro Pro ser Pro Glu lie lie Ala Lys Ala val val 
130 135 140 

Ala Leu Leu Asn Asn Asp Met Glu Tyr Leu Gin Pro Met Leu Asp Leu 
145 150 155 160 

Ala Gly Tyr Thr Glu Ala Cys Gly Cys Asp Leu Gin Thr Lys Val Val 
165 170 175 

Asn Gin Gly Leu Cys Thr Gly Cys Gly Thr Cys Ala Met Ala Cys Gin 
180 185 190 
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SChTXi^feiy W Ws#. *fir Asn Gly Arg Pro Glu Leu Asn sen Asp 
195 200 205 

Arg cys lie Lys Cys Gly lie Cys Tyr Val Gin Cys Pro Arg ser Trp 
210 215 " 220 

Trp Pro Glu Glu Gin He Lys Lys Glu Leu Gly Leu 
225 230 235 

<210> 116 
<211> 259 
<212> PRT 

<213> Methanosarcina barkeri 
<400> 116 

Met Ala Asn Lys Ile Lys Leu Gly His Val His Leu Ser Gly Cys Thr 
1 5 10 15 

Gly Cys Leu val Ser val Ala Asp Asn Tyr Gin Gly Phe Leu Lys lie 
20 25 30 

Leu Asp Asp Tyr Ala Asp Leu Val Tyr cys Leu Thr Leu Ala Asp val 
35 40 45 

Arg His lie Pro Glu Met Asp Val Ala Leu val Glu Gly ser Val cys 
50 55 60 

lie Gin Asp Arg Glu Ser Val Glu Asp lie Lys Glu Thr Arg Lys Lys 
65 70 75 80 

ser Arg lie val val Ala Leu Gly ser cys Ala Ser Tyr Gly Asn lie 
85 90 95 

Thr Arg Phe Cys Arg Gly Gly Gin His Asn His Pro Gin His Glu Ser 
100 " ' 105 110 

Tyr Leu Pro lie Gly Asp Leu lie Asp Val Asp val Tyr lie Pro Gly 
115 120 125 

Cys Pro Pro Ser Pro Glu Leu lie Arg Asn Val Ala lie Met Ala Tyr 
130 135 " 140 

Leu Leu Leu Glu Gly Asn Glu Glu Gin Lys Asp Leu Ala Gly Arg Tyr 
145 150 155 160 

Leu Lys pro Leu Met Asp Leu Ala Lys Arg Gly Thr Thr Gly cys Phe 
165 170 175 

Cys Asp Leu Met Asp Asp val He Asn Gin Gly Leu Cys lie Gly cys 
180 185 190 

Gly lie cys Ala Ala Ser Cys Pro Val Arg Ala lie Thr His Glu Phe 
195 200 205 

Gly Lys Pro Gin Gly Asp Leu Asn Leu Cys lie Lys Cys Gly Ser Cys 
210 215 220 

PDyrjfllVWWS^ W''»f*r Phe Phe Asn Pro Asp Val He Ser Glu 
225 230 235 240 

Phe Glu Ser lie Asn Glu lie lie Ala Gly Ala Leu Lys Glu Gly Glu 
245 250 255 

Lys Asp Asp 



<210> 117 

<211> 142 

<212> PRT 

<213> Rhodospirillum rubrum 

<400> 117 

Met Asn Phe Leu Ser Arg Met Ser Lys Lys Ser Pro Trp Leu Tyr Arg 
1 5 10 15 

Ile Asn Ala Gly Ser Cys Asn Gly Cys Asp Val Glu Leu Ala Thr Thr 
20 25 30 

Ala Cys Ile Pro Arg Tyr Asp Val Glu Arg Leu Gly Cys Gln Tyr Cys 
35 40 45 

Gly Ser Pro Lys His Ala Asp Ile Val Leu Val Thr Gly Pro Leu Thr 
50 55 60 

Ala Arg Val Lys Asp Lys Val Leu Arg Val Tyr Glu Glu Ile Pro Asp 
65 70 75 80 

Pro Lys Val Thr Val Ala Ile Gly Val Cys Pro Ile Ser Gly Gly Val 
85 90 95 

Phe Arg Glu Ser Tyr Ser Ile Val Gly Pro Ile Asp Arg Tyr Leu Pro 
100 105 110 

Val Asp Val Asn Val Pro Gly Cys Pro Pro Arg Pro Gln Ala Ile Ile 
115 120 125 

Glu Gly Ile Ala Lys Ala Ile Glu Ile Trp Ala Gly Arg Ile 
130 135 140 

<210> 118 
<211> 428 
<212> PRT 

<213> Pyrococcus furiosus 
<400> 118 

Met Lys Asn Leu Tyr Leu Pro Ile Thr Ile Asp His Ile Ala Arg Val 
1 5 10 15 

Glu Gly Lys Gly Gly Val Glu lie lie lie Gly Asp Asp Gly val Lys 
20 25 30 

Glu Val Lys Leu Asn lie lie Glu Gly Pro Arg Phe Phe Glu Ala lie 
35 40 45 

Thr Ile Gly Lys Lys Leu Glu Glu Ala Leu Ala Ile Tyr Pro Arg Ile 
50 55 60 

Cys Ser Phe Cys Ser Ala Ala His Lys Leu Thr Ala Leu Glu Ala Ala 
65 70 75 80 

Glu Lys Ala Val Gly Phe Val Pro Arg Glu Glu lie Gin Ala Leu Arg 
85 90 95 

Glu val Leu Tyr lie Gly Asp Met lie Glu Ser His Ala Leu His Leu 
100 105 110 

Tyr Leu Leu val Leu Pro Asp Tyr Arg Gly Tyr Ser Ser Pro Leu Lys 
115 120 125 

Met val Asn Glu Tyr Lys Arg Glu lie Glu lie Ala Leu Lys Leu Lys 
130 135 140 

Asn Leu Gly Thr Trp Met Met Asp lie Leu Gly ser Arg Ala lie His 
145 150 155 160 

Gln Glu Asn Ala Val Leu Gly Gly Phe Gly Lys Leu Pro Glu Lys Ser 
165 170 175 

Val Leu Glu Lys Met Lys Ala Glu Leu Arg Glu Ala Leu Pro Leu Ala 
180 185 ~ 190 

Glu Tyr Thr Phe Glu Leu Phe Ala Lys Leu Glu Gin Tyr Ser Glu val 
195 200 205 

Glu Gly Pro lie Thr His Leu Ala Val Lys Pro Arg Gly Asp Ala Tyr 
210 215 220 

Gly lie Tyr Gly Asp Tyr lie Lys Ala Ser Asp Gly Glu Glu Phe Pro 
225 230 235 240 

ser Glu Lys Tyr Arg Asp Tyr lie Lys Glu Phe val val Glu His ser 
245 250 255 

Phe Ala Lys His Ser His Tyr Lys Gly Arg Pro Phe Met Val Gly Ala 
260 265 270 

lie Ser Arg val He Asn Asn Ala Asp Leu Leu Tyr Gly Lys Ala Lys 
275 280 285 

Glu Leu Tyr Glu Ala Asn Lys Asp Leu Leu Lys Gly Thr Asn Pro Phe 
290 295 300 

Ala Asn Asn Leu Ala Gin Ala Leu Glu lie val Tyr Phe lie Glu Arg 
305 310 315 320 

Ala lie Asp Leu Leu Asp Glu Ala Leu Ala Lys Trp Pro lie Lys Pro 
325 330 335 

Arg Asp Glu Val Glu lie Lys Asp Gly Phe Gly val Ser Thr Thr Glu 
340 345 350 
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«AI& *pm wt*g>my w& li'se-A/'al Tyr Ala Leu Lys val Glu Asn Gly Arg 
355 360 365 

val Ser Tyr Ala Asp lie He Thr Pro Thr Ala Phe Asn Leu Ala Met 
370 375 380 

Met Glu Glu His Val Arg Met Met Ala Glu Lys His Tyr Asn Asp Asp 
385 390 395 400 

Pro Glu Arg Leu Lys lie Leu Ala Glu Met Val val Arg Ala Tyr Asp 
405 410 ~ 415 

Pro cys lie ser cys ser Val His val Val Arg Leu 
420 425 

<210> 119 
<211> 555 
<212> PRT 

<213> Escherichia coli 
<400> 119 

Met Asn Val Asn Ser Ser Ser Asn Arg Gly Glu Ala Ile Leu Ala Ala 
1 5 10 15 

Leu Lys Thr Gin Phe Pro Gly Ala Val Leu Asp Glu Glu Arg Gin Thr 
20 25 30 

Pro Glu Gin val Thr lie Thr Val Lys lie Asn Leu Leu Pro Asp Val 
35 40 45 

val Gin Tyr Leu Tyr Tyr Gin His Asp Gly Trp Leu Pro Val Leu Phe 
50 55 60 

Gly Asn Asp Glu Arg Thr Leu Asn Gly His Tyr Ala Val Tyr Tyr Ala 
65 70 75 80 

Leu ser Met Glu Gly Ala Glu Lys cys Trp lie Val val Lys Ala Leu 
85 90 95 

val Asp Ala Asp ser Arg Glu Phe Pro ser val Thr Pro Arg Val Pro 
100 105 110 

Ala Ala val Trp Gly Glu Arg Glu lie Arg Asp Met Tyr Gly Leu lie 
115 120 125 

Pro Val Gly Leu Pro Asp Gin Arg Arg Leu Val Leu Pro Asp Asp Trp 
130 135 140 

Pro Glu Asp Met His Pro Leu Arg Lys Asp Ala Met Asp Tyr Arg Leu 
145 150 ~ 155 160 

Arg pro Glu Pro Thr Thr Asp ser Glu Thr Tyr Pro Phe lie Asn Glu 
165 170 175 

Gly Asn ser Asp Ala Arg val lie Pro val Gly Pro Leu His lie Thr 
180 ~ 185 190 
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•Bff^Sia^i«y^WS-Aie Arg Leu Phe Val Asp Gly Glu Gin lie 
195 200 205 

Val Asp Ala Asp Tyr Arg Leu Phe Tyr Val His Arg Gly Met Glu Lys 
210 215 220 

Leu Ala Glu Thr Arg Met Gly Tyr Asn Glu val Thr Phe Leu ser Asp 
225 230 235 240 

Arg Val Cys Gly lie Cys Gly Phe Ala His Ser Val Ala Tyr Thr Asn 
245 250 255 

Ser val Glu Asn Ala Leu Gly lie Glu Val Pro Gin Arg Ala His Thr 
260 265 270 

lie Arg ser lie Leu Leu Glu val Glu Arg Leu His ser His Leu Leu 
275 280 285 

Asn Leu Gly Leu ser Cys His Phe Val Gly Phe Asp Thr Gly Phe Met 
290 295 300 

Gin Phe Phe Arg val Arg Glu Lys Ser Met Thr Met Ala Glu Leu Leu 
305 310 315 320 

lie Gly Ser Arg Lys Thr Tyr Gly Leu Asn Leu lie Gly Gly val Arg 
325 330 335 

Arg Asp lie Leu Lys Glu Gin Arg Leu Gin Thr Leu Lys Leu Val Arg 
340 345 350 

Glu Met Arg Ala Asp Val Ser Glu Leu Val Glu Met Leu Leu Ala Thr 
355 360 365 

Pro Asn Met Glu Gin Arg Thr Gin Gly lie Gly lie Leu Asp Arg Gin 
370 375 380 

lie Ala Arg Asp Leu Arg Phe Asp His Pro Tyr Ala Asp Tyr Gly Asn 
385 ~ 390 395 400 

lie Pro Lys Thr Leu Phe Thr Phe Thr Gly Gly Asp val Phe ser Arg 
405 410 415 

Val Met Val Arg Val Lys Glu Thr Phe Asp Ser Leu Ala Met Leu Glu 
420 425 430 

Phe Ala Leu Asp Asn Met Pro Asp Thr Pro Leu Leu Thr Glu Gly Phe 
435 440 445 

ser Tyr Lys Pro His Ala Phe Ala Leu Gly Phe Val Glu Ala Pro Arg 
450 455 460 

Gly Glu Asp Val His Trp Ser Met Leu Gly Asp Asn Gin Lys Leu Phe 
465 470 475 480 

Arg Trp Arg Cys Arg Ala Ala Thr Tyr Ala Asn Trp Pro Val Leu Arg 
485 490 495 
Tyr Met Leu Arg Gly Asn Thr Val Ser Asp Ala Pro Leu Ile Ile Gly 
500 505 510 

Ser Leu Asp Pro Cys Tyr Ser Cys Thr Asp Arg val Thr Leu Val Asp 
515 520 525 

Val Arg Lys Arg Gin Ser Lys Thr Val Pro Tyr Lys Glu lie Glu Arg 
530 535 540 

Tyr Gly lie Asp Arg Asn Arg ser Pro Leu Lys 
545 550 ~ 555 

<210> 120 
<211> 405 
<212> PRT 

<213> Methanothermobacter thermautotrophicus 
<400> 120 

Met Ser Glu Arg Ile Val Ile Ser Pro Thr Ser Arg Gln Glu Gly His 
1 5 10 15 

Ala Glu Leu Val Met Glu val Asp Asp Glu Gly lie Val Thr Lys Gly 
20 25 30 

Arg Tyr Phe Ser lie Thr Pro Val Arg Gly Leu Glu Lys lie Val Thr 
35 40 45 

Gly Lys Ala Pro Glu Thr Ala Pro Val lie Val Gin Arg lie Cys Gly 
50 55 60 

val cys Pro lie Pro His Thr Leu Ala Ser val Glu Ala lie Asp Asp 
65 70 75 80 

ser Leu Asp lie Glu Val Pro Lys Ala Gly Arg Leu Leu Arg Glu Leu 
85 90 95 

Thr Leu Ala Ala His His Val Asn ser His Ala lie His His Phe Leu 
100 105 110 

lie Ala Pro Asp Phe Val Pro Glu Asn Leu Met Ala Asp Ala lie Asn 
115 120 125 

Ser val Ser Glu lie Arg Lys Asn Ala Gin Tyr Val Val Asp Met val 
130 135 140 

Ala Gly Glu Gly lie His Pro ser Asp val Arg lie Gly Gly Met Ala 
145 150 155 160 

Asp Asn lie Thr Glu Leu Ala Arg Lys Arg Leu Tyr Ala Arg Leu Lys 
165 170 175 

Gin Leu Lys Pro Lys val Asp Glu His val Glu Leu Met lie Gly Leu 
180 185 190 

lie Glu Asp Lys Gly Leu Pro Lys Gly Leu Gly val His Asn Gin Pro 
195 200 205 
Thr Leu Ala Ser His Gln Ile Tyr Gly Asp Arg Thr Lys Phe Asp Leu 
210 215 220 

Asp Arg Phe Thr Glu Val Met Pro Glu Ser Trp Tyr Asp Asp Pro Glu 
225 230 235 240 

lie Ala Lys Arg Ala cys Ser Thr lie Pro Leu Tyr Asp Gly Arg Asn 
245 250 255 

val Glu val Gly Pro Arg Ala Arg Met Val Glu Phe Gin Gly Phe Lys 
260 265 270 

Glu Arg Gly val Val Ala Gin His val Ala Arg Ala Leu Glu Met Lys 
275 280 285 

Thr Ala Leu Ala Arg Ala lie Glu lie Leu Asp Glu Leu Asp Thr ser 
290 ~ 295 300 

Ala Pro Val Arg Ala Asp Phe Asp Glu Arg Gly Thr Gly Lys Leu Gly 
305 310 315 320 

Val Gly Ala lie Glu Gly Pro Arg Gly Leu Asp val His Met Ala Gin 
325 330 335 

val Glu Asn Gly Lys lie Gin Phe Tyr Ser Ala Leu Val Pro Thr Thr 
340 345 350 

Trp Asn lie Pro Thr Met Gly Pro Ala Thr Glu Gly Phe His His Glu 
355 360 365 

Tyr Gly Pro His Val lie Arg Ala Tyr Asp Pro Cys Leu ser cys Ala 
370 375 380 

Thr His Val Met val val Asp Asp Glu Asp Arg Ser val lie Arg Asp 
385 390 395 400 

Glu Met Val Arg Leu 
405 

<210> 121 
<211> 456 
<212> PRT 

<213> Methanosarcina barkeri 
<400> 121 

Met Thr Lys Val Val Glu Ile Ser Pro Thr Thr Arg His Glu Gly His 
1 5 10 15 

ser Lys Leu Thr Leu Lys Val Asn Asp Glu Gly lie Val Glu Arg Gly 
20 25 30 

Asp Trp Leu ser Thr Thr Pro val Arg Gly lie Glu Lys Leu Ala lie 
35 40 ~ 45 

Gly Lys Thr Met Asp Gin Val Pro Lys lie Ala Ser Arg Val Cys Gly 
50 55 60 
Ile Cys Pro Ile Ala His Thr Leu Ala Gly Ile Glu Ala Met Glu Ala 
65 70 75 80 

ser He Gly cys Glu He Pro Lys Asp Ala Lys Leu Leu Arg Val lie 
85 90 95 

Leu His Ala Ala Asn Arg Leu His Ser His Ala Leu His Asn lie Leu 
100 105 110 

lie Leu Pro Asp Phe Tyr lie Pro Asp Thr Glu Thr Lys lie Asn Pro 
115 120 125 

Phe ser Lys Glu Gin Pro Leu Arg ser val Ala val Arg lie Phe Arg 
130 135 ~ 140 

lie Arg Glu lie Ala Gin Thr lie Gly Ala val Ala Gly Gly Glu Ala 
145 ~ 150 155 160 

lie His Pro Ser Asn Pro Arg val Gly Gly Met Tyr Arg Asn Val Ser 
165 ~ 170 ^ 175 

Ser Arg Ala Lys Gin Lys lie Ala Asp Leu Ala Lys Glu Gly Leu Val 
180 185 190 

Leu Ala His Glu Gin Met Glu Phe Met lie Glu Val lie Arg Asn Met 
195 200 205 

Gin Asp Arg Glu Phe val Glu val Ala Gly Lys Gin lie Pro Leu Pro 
210 215 220 

Lys Thr Leu Gly Tyr His Asn Gin Gly val Met Ala Thr Ala Pro Met 
225 230 235 240 

Tyr Gly ser ser ser Leu Asp Glu Lys Pro Met Trp Asp Phe Thr Arg 
245 250 255 

Trp Arg Glu Thr Arg Pro Trp Asp Trp Tyr Met ser Glu Glu Thr lie 
260 265 270 

Asp Leu Glu Asp Ser Ser Tyr Pro lie Gly Gly Thr Thr Lys Val Gly 
275 280 285 

Thr Lys Val Asn Pro Arg Met Glu Ala cys Asn Thr val Pro Thr Tyr 
290 295 300 

Asp Gly Gin Pro val Glu Val Gly Pro Arg Ala Arg Leu Ala Thr Phe 
305 " 310 315 320 

Lys His Phe Thr Glu Lys Gly Thr Phe Ala Gin His lie Ala Arg Gin 
325 330 335 

Met Glu Tyr Thr Asp Cys Tyr Tyr Thr lie Leu Asn cys Leu Glu Asn 
340 345 350 

Leu Asp Thr Ser Gly Lys Val Leu Ala Asp Thr lie Pro Leu Gly Asn 
355 360 365 
Gly Ser Met Gly Trp Ala Ala Asn Glu Ala Pro Arg Gly Thr Asp Val 
370 375 380 

His Leu Ala Arg val Lys Asp Gly Lys Val Leu Arg Tyr Glu Met Leu 
385 ~ 390 395 400 

val Pro Thr Thr Trp Asn Phe Pro Thr cys ser Arg Ala Leu Thr Gly 
405 410 415 

Ala Pro Trp Gin lie Ala Glu Met Val lie Arg Ala Tyr Asp Pro cys 
420 425 430 

Val Ser Cys Ala Thr His Met lie Val Val Asn Glu Glu Asp Arg lie 
435 440 445 

Val Ala Gin Lys Leu Met Gin Trp 
450 455 

<210> 122 
<211> 361 
<212> PRT 

<213> Rhodospirillum rubrum 
<400> 122 

Met Ser Thr Tyr Thr Ile Pro Val Gly Pro Leu His Val Ala Leu Glu 
1 5 10 15 

Glu Pro Met Tyr Phe Arg lie Glu Val Asp Gly Glu Lys Val val ser 
20 25 30 

Val Asp lie Thr Ala Gly His val His Arg Gly lie Glu Tyr Leu Ala 
35 40 ~ 45 

Thr Lys Arg Asn lie Tyr Gin Asn lie Val Leu Thr Glu Arg Val cys 
50 55 60 

Ser Leu Cys Ser Asn Ser His Pro Gin Thr Tyr Cys Met Ala Leu Glu 
65 70 75 80 

Ser lie Thr Gly Met Val val Pro Pro Arg Ala Gin Tyr Leu Arg Val 
85 90 95 

lie Ala Asp Glu Thr Lys Arg val Ala ser His Met Phe Asn val Ala 
100 ~ 105 110 

lie Leu Ala His lie Val Gly Phe Asp Ser Leu Phe Met His Val Met 
115 120 125 

Glu Ala Arg Glu lie Met Gin Asp Thr Lys Glu Ala Val Phe Gly Asn 
130 135 140 

Arg Met Asp lie Ala Ala Met Ala lie Gly Gly val Lys Tyr Asp Leu 
145 150 155 160 

Asp Lys Asp Gly Arg Asp Tyr Phe lie Gly Gin Leu Asp Lys Leu Glu 
165 170 175 
Pro Thr Leu Arg Asp Glu Ile Ile Pro Leu Tyr Gln Thr Asn Pro Ser 
180 185 190 

lie Val Asp Arg Thr Arg Gly He Gly val Leu ser Ala Ala Asp cys 
195 200 205 

Val Asp Tyr Gly Leu Met Gly Pro Val Ala Arg Gly Ser Gly His Ala 
210 215 220 

Tyr Asp Val Arg Lys Gin Ala Pro Tyr Ala val Tyr Asp Arg Leu Asp 
225 230 235 240 

Phe Glu Met Ala Leu Gly Glu His Gly Asp val Trp Ser Arg Ala Met 
245 250 255 

Val Arg Trp Gin Glu Ala Leu Thr ser lie Gly Leu lie Arg Gin Cys 
260 265 270 

Leu Arg Asp Met Pro Asp Gly Pro Thr Lys Ala Gly Pro Val Pro Pro 
275 280 285 

lie Pro Ala Gly Glu Ala val Ala Lys Thr Glu Ala Pro Arg Gly Glu 
290 295 300 

Leu lie Tyr Tyr Leu Lys Thr Asn Gly Thr Asp Arg Pro Glu Arg Leu 
305 310 315 ~ 320 

Lys Trp Arg Val Pro Thr Tyr Met Asn Trp Asp Ala Leu Asn Val Met 
325 330 335 

Met Ala Gly Ala Arg lie ser Asp lie Pro Leu lie val Asn ser lie 
340 345 350 

Asp Pro Cys lie Ser Cys Thr Glu Arg 
355 360 

<210> 123 
<211> 505 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 123 

Met Ala Leu Gly Leu Leu Ala Glu Leu Arg Ala Gly Gln Ala Val Ala 
1 5 10 15 

Cys Ala Arg Arg Thr Asn Ala Pro Ala His Pro Ala Ala val Val Pro 
20 25 30 

cys Leu Pro ser Arg Ala Gly Lys Phe Phe Asn Leu ser Gin Lys val 
35 40 45 

Pro ser ser Gin ser Ala Arg Gly ser Thr lie Arg Val Ala Ala Thr 
50 55 60 
Ala Thr Asp Ala Val Pro His Trp Lys Leu Ala Leu Glu Glu Leu Asp 
65 70 75 80 

Lys Pro Lys Asp Gly Gly Arg Lys Val Leu lie Ala Gin val Ala Pro 
85 90 95 

Ala Val Arg Val Ala lie Ala Glu ser Phe Gly Leu Ala Pro Gly Ala 
100 105 110 

Val Ser Pro Gly Lys Leu Ala Thr Gly Leu Arg Ala Leu Gly Phe Asp 
115 120 125 

Gin Val Phe Asp Thr Leu Phe Ala Ala Asp Leu Thr lie Trp Glu Glu 
130 135 140 

Gly Thr Glu Leu Leu His Arg Leu Lys Glu His Leu Glu Ala His Pro 
145 150 ~ 155 160 

His ser Asp Glu Pro Leu Pro Met Phe Thr ser cys Cys Pro Gly Trp 
165 170 175 

Val Ala Met Met Glu Lys ser Tyr Pro Glu Leu lie Pro Phe val ser 
180 ' 185 190 

Ser Cys Lys Ser Pro Gin Met Met Met Gly Ala Met Val Lys Thr Tyr 
195 200 205 

Leu Ser Glu Lys Gin Gly lie Pro Ala Lys Asp lie Val Met Val Ser 
210 215 220 

Val Met Pro Cys Val Arg Lys Gin Gly Glu Ala Asp Arg Glu Trp Phe 
225 230 235 240 

Cys val ser Glu Pro Gly val Arg Asp val Asp His val lie Thr Thr 
245 ~ 250 255 

Ala Glu Leu Gly Asn lie Phe Lys Glu Arg Gly lie Asn Leu Pro Glu 
260 265 270 

Leu Pro Asp ser Asp Trp Asp Gin Pro Leu Gly Leu Gly Ser Gly Ala 
275 280 285 

Gly Val Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala Leu Arg 
290 295 300 

Thr Ala Tyr Glu lie Val Thr Lys Glu Pro Leu Pro Arg Leu Asn Leu 
305 310 315 320 

Ser Glu Val Arg Gly Leu Asp Gly lie Lys Glu Ala ser val Thr Leu 
325 330 335 

Val pro Ala Pro Gly Ser Lys Phe Ala Glu Leu val Ala Glu Arg Leu 
340 345 350 

Ala His Lys Val Glu Glu Ala Ala Ala Ala Glu Ala Ala Ala Ala Val 
355 360 365 
Glu Gly Ala Val Lys Pro Pro Ile Ala Tyr Asp Gly Gly Gln Gly Phe 
370 375 380 

ser Thr Asp Asp Gly Lys Gly Gly Leu Lys Leu Arg val Ala val Ala 
385 390 395 400 

Asn Gly Leu Gly Asn Ala Lys Lys Leu lie Gly Lys Met val ser Gly 
405 410 415 

Glu Ala Lys Tyr Asp Phe Val Glu lie Met Ala cys Pro Ala Gly Cys 
420 425 430 

val Gly Gly Gly Gly Gin Pro Arg ser Thr Asp Lys Gin lie Thr Gin 
435 440 445 

Lys Arg Gin Ala Ala Leu Tyr Asp Leu Asp Glu Arg Asn Thr Leu Arg 
450 455 460 

Arg Ser His Glu Asn Glu Ala Val Asn Gin Leu Tyr Lys Glu Phe Leu 
465 470 475 480 

Gly Glu Pro Leu Ser His Arg Ala His Glu Leu Leu His Thr His Tyr 
485 490 495 

Val Pro Gly Gly Ala Glu Ala Asp Ala 
500 505 

<210> 124 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 124 

Gly Ala Gly Val Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Leu Arg Thr 



<210> 125 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 125 

Gly Gly Gly Ala Ile Phe Cys Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Val Arg Ser 



<210> 126 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 126 

Gly Gly Ala Thr Ile Phe Gly Val Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Phe 



<210> 127 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 127 

Gly Ala Gly Ala Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Ser 



<210> 128 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 128 

Gly Ala Gly Ala Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Ile Arg Ser 



<210> 129 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 129 

Gly Ala Ala Val Ile Phe Gly Val Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Leu Arg Thr 



<210> 130 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 130 

Gly Ala Gly Gln Ile Phe Ala Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Ser Arg Thr 



<210> 131 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 131 

Gly Ala Ala Val Ile Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Leu Arg Thr 



<210> 132 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 132 

Gly Ala Ala Pro Ile Phe Gly Val Thr Gly Gly Val Ile Glu Ala Ala 
1 5 10 15 

Leu Arg Thr 



<210> 133 
<211> 19 
<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 133 

Gly Ala Gly Val Ile Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Leu Arg Ser 



<210> 134 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 134 

Gly Ala Gly Val Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Ile Arg Thr 



<210> 135 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 135 

Ser Ala Gly Asn Leu Phe Gly Val Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 

Ile Arg Thr 



<210> 136 

<211> 19 

<212> PRT 

<213> Artificial sequence 

<220> 

<223> Synthetic sequence 

<400> 136 

Gly Ala Gly Ala Ile Phe Gly Ala Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Thr 



<210> 137 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 137 

Gly Ala Gly Val Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Thr 



<210> 138 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 138 




*©T^»0«/M Ala Ala 

1 5 10 15 

Leu Arg Thr 



<210> 139 

<211> 19 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 139 

Gly Ala Gly Val Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Val Arg Thr 



<210> 140 

<211> 19 

<212> PRT 

<213> Artificial sequence 

<220> 

<223> Synthetic sequence 

<400> 140 

Gly Ala Gly Thr Ile Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Thr 



<210> 141 
<211> 19 
<212> PRT 

<213> Artificial sequence 

<220> 

<223> Synthetic construct 
<400> 141 

Gly Gly Gly Val Leu Phe Gly Thr Thr Gly Gly Val Met Glu Ala Ala 
1 5 10 15 



Leu Arg Thr 



<210> 142 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic construct 

<400> 142 

Thr Ile Met Glu Glu 
1 5 
<210> 143 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic construct 

<400> 143 

Thr Ile Val Glu Glu 
1 5 

<210> 144 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 144 

Thr Ile Trp Glu Glu 
1 5 

<210> 145 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 145 

Thr Ile Cys Glu Glu 
1 5 

<210> 146 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 146 

Val Ile Met Glu Glu 
1 5 

<210> 147 

<211> 5 

<212> PRT 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 147 

Thr Ala Arg Leu Glu 
1 5 

<210> 148 

<211> 260 

<212> DNA 

<213> Chlamydomonas reinhardtii 
<400> 148 
gcagttgggt caggggctgg cgacgcgctg ctgacgcgca agtgaatggc ccaacaagtc 60 
gcctcgcggt cgctgtcggc gccaaacccg cagctgcatc caccagattc acttgttaga 120 
tcgacctagg ttgcgggacc ggaggcggct cgctgtgcaa gcgcggtgac ctcgtacggc 180 
ggcatggatc gccatctcga ttcgcgcggc agaatcgggc cccgcgcaca tttaagccgc 240 
gggcgagact catttcgtta 260 


<210> 149 
<211> 1181 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 149 
gccagaagga 


gcgcagccaa 


accaggatga 


tgtttgatgg 


ggtatttgag 


cacttgcaac 


60 


ccttatccgg 


aagccccctg 


gcccacaaag 


gctaggcgcc 


aatgcaagca 


gttcgcatgc 


120 


agcccctgga 


gcggtgccct 


cctgataaac 


cggccagggg 


gcctatgttc 


tttacttttt 


180 


tacaagagaa 


gtcactcaac 


atcttaaaat 


ggccaggtga 


gtcgacgagc 


aagcccggcg 


240 


gatcaggcag 


cgtgcttgca 


gatttgactt 


gcaacgcccg 


cattgtgtcg 


acgaaggctt 


300 


ttggctcctc 


tgtcgctgtc 


tcaagcagca 


tctaaccctg 


cgtcgccgtt 


tccatttgca 


360 


aaataaccaa 


gctgaccagc 


accattccaa 


tgctcaccgc 


gcgcgacgtc 


gccggagcgg 


420 


tcgagttctg 


gaccgaccgg 


ctcgggttct 


cccgggactt 


cgtggaggac 


gacttcgccg 


480 


gtgtggtccg 


ggacgacgtg 


accctgttca 


tcagcgcggt 


ccaggaccag 


gtgagtcgac 


540 


gagcaagccc 


ggcggatcag 


gcagcgtgct 


tgcagatttg 


acttgcaacg 


cccgcattgt 


600 


gtcgacgaag 


gcttttggct 


cctctgtcgc 


tgtctcaagc 


agcatctaac 


cctgcgtcgc 


660 


cgtttccatt 


tgcaggacca 


ggtggtgccg 


gacaacaccc 


tggcctgggt 


gtgggtgcgc 


720 


ggcctggacg 


agctgtacgc 


cgagtggtcg 


gaggtcgtgt 


ccacgaactt 


ccgggacgcc 


780 


tccgggccgg 


ccatgaccga 


gatcggcgag 


cagccgtggg 


ggcgggagtt 


cgccctgcgc 


840 


gacccggccg 


gcaactgcgt 


gcacttcgtg 


gccgaggagc 


aggactaacc 


gacgtcgacc 


900 


cactctagag 


gatcgatccc 


cgctccgtgt 


aaatggaggc 


gctcgttgat 


ctgagccttg 


960 


ccccctgacg 


aacggcggtg 


gatggaagat 


actgctctca 


agtgctgaag 


cggtagctta


1020 


gctccccgtt 


tcgtgctgat 


cagtcttttt 


caacacgtaa 


aaagcggagg 


agttttgcaa 


1080 


ttttgttggt 


tgtaacgatc 


ctccgttgat 


tttggcctct 


ttctccatgg 


gcgggctggg 


1140 


cgtatttgaa 


gcttaattaa 


ctcgaggggg 


ggcccggtac 


c 




1181 


<210> 150 
<211> 260
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 150 
ccgacgtcga 


cccactctag 


aggatcgatc 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


60 


atctgagcct 


tgccccctga 


cgaacggcgg 


tggatggaag atactgctct 


caagtgctga 


120 

agcggtagct tagctccccg tttcgtgctg atcagtcttt ttcaacacgt aaaaagcgga 180 
ggagttttgc aattttgttg gttgtaacga tcctccgttg attttggcct ctttctccat 240 
gggcgggctg ggcgtatttg 260 

<210> 151 

<211> 520 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetic sequence 

<400> 151 



ccgacgtcga 


cccactctag 


aggatcgatc 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


60 


atctgagcct 


tgccccctga 


cgaacggcgg 


tggatggaag 


atactgctct 


caagtgctga 


120 


agcggtagct 


tagctccccg 


tttcgtgctg 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


180 


ggagttttgc 


aattttgttg 


gttgtaacga 


tcctccgttg 


attttggcct 


ctttctccat 


240 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


300 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


cgctgtcggc 


gccaaacccg 


cagctgcatc 


360 


caccagattc 


acttgttaga 


tcgacctagg 


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


420 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


gccatctcga 


ttcgcgcggc 


agaatcgggc 


480 


cccgcgcaca 


tttaagccgc 


gggcgagact 


catttcgtta 






520 



<210> 152 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 152 

atccgtagtt atccttatgg ccatcttagc 30 

<210> 153 

<211> 30 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> Synthetic sequence 

<400> 153 

cgtgcatcga ttaacagctt ctggacctga 30 

<210> 154 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 154 

ttaaacgtcg tacgtccaag tataactaag 30 



<210> 155 
<211> 30 
<212> DNA 
<220>

<223> Synthetic sequence 



<400> 155 

aatctgatac atgctattca gatcttacaa 



30 



<210> 156 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetic sequence 

<400> 156 

tcttccatcg taaatctagc atcgattagc 30 



<210> 157 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 157 

atctgtaata atctagtcga ggcattcaag 30 



<210> 158 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetic sequence 

<400> 158 

aactggctta aatcgttaac aatcgtgtga 30 



<210> 159 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 159 

gatttaacat aactgtcgat taccgtgcga 30 



<210> 160 

<211> 30 

<212> DNA 

<213> Artificial sequence 

<220> 

<223> synthetic sequence 

<400> 160 

tatgcttgac aatcgtaatc ctggtgacaa 30 



<210> 161 

<211> 30 

<212> DNA 

<213> Artificial sequence 



<220> 
<400> 161

taacaagaat ctggctaatc aatcgatgca 30



<210> 162 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 162 

gtagtcggaa tagttactaa cgaggattcg 30 



<210> 163 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 163 

aaatgtctac tcgactagta aatcgtaact 30 



<210> 164 
<211> 290 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 164 

gcagttgggt caggggctgg cgacgcgctg ctgacgcgca agtgaatggc ccaacaagtc 60 
gcctcgcggt cgctgtcggc gccaaacccg cagctgcatc caccagattc acttgttaga 120 
tcgacctagg ttgcgggacc ggaggcggct cgctgtgcaa gcgcggtgac ctcgtacggc 180 
ggcatggatc gccatctcga ttcgcgcggc agaatcgggc cccgcgcaca tttaagccgc 240 
gggcgagact catttcgtta atccgtagtt atccttatgg ccatcttagc 290 



<210> 165 

<211> 580 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetic sequence 

<400> 165 



cgtgcatcga 


ttaacagctt 


ctggacctga 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


300 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


360 


cgctgtcggc 


gccaaacccg 


cagctgcatc 


caccagattc 


acttgttaga 


tcgacctagg 


420 


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


480 
gccatctcga ttcgcgcggc agaatcgggc cccgcgcaca tttaagccgc gggcgagact 540

catttcgtta ttaaacgtcg tacgtccaag tatgactaag 580 



<210> 166 

<211> 566 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 166 



aatctgatac 


atgctattca 


gatcttacaa 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


300 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


360


cgctgtcggc 


gccaaacccg 


cagctgcatc 


caccagattc 


acttgttaga 


tcgacctagg 


420 


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


480 


gccatctcga 


ttcgcgcggc 


agaatcgggc 


cccgcgcaca tttaagccgc 


gggcgatctt 


540 


ccatcgtaaa 


tctagcatcg 


attagc 








566 


<210> 167 
<211> 290 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 167 
atctgtaata 


atctagtcga 


ggcattcaag 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 




290 


<210> 168 
<211> 1181 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 168 
gccagaagga 


gcgcagccaa 


accaggatga 


tgtttgatgg 


ggtatttgag 


cacttgcaac 


60 


ccttatccgg 


aagccccctg 


gcccacaaag 


gctaggcgcc 


aatgcaagca 


gttcgcatgc 


120 


agcccctgga 


gcggtgccct 


cctgataaac 


cggccagggg 


gcctatgttc 


tttacttttt 


180 


tacaagagaa 


gtcactcaac 


atcttaaaat 


ggccaggtga 


gtcgacgagc 


aagcccggcg 


240 


gatcaggcag 


cgtgcttgca 


gatttgactt 


gcaacgcccg 


cattgtgtcg 


acgaaggctt 


300 


ttggctcctc 


tgtcgctgtc 


tcaagcagca 


tctaaccctg cgtcgccgtt 
ggatggccaa


gctgaccagc 


gccgttccgg 


tgctcaccgc 


gcgcgacgtc 


gccggagcgg 


420 


tcgagttctg 


gaccgaccgg 


ctcgggttct 


cccgggactt 


cgtggaggac 


gacttcgccg 


480 


gtgtggtccg 


ggacgacgtg 


accctgttca 


tcagcgcggt 


ccaggaccag 


gtgagtcgac 


540 


gagcaagccc 


ggcggatcag 


gcagcgtgct 


tgcagatttg 


acttgcaacg 


cccgcattgt 


600 


gtcgacgaag 


gcttttggct 


cctctgtcgc 


tgtctcaagc 


agcatctaac 


cctgcgtcgc 


660 


cgtttccatt 


tgcaggacca 


ggtggtgccg 


gacaacaccc 


tggcctgggt 


gtgggtgcgc 


720 


ggcctggacg 


agctgtacgc 


cgagtggtcg 


gaggtcgtgt 


ccacgaactt 


ccgggacgcc 


780 


tccgggccgg 


ccatgaccga 


gatcggcgag 


cagccgtggg 


ggcgggagtt 


cgccctgcgc 


840 


gacccggccg 


gcaactgcgt 


gcacttcgtg 


gccgaggagc 


aggactaacc 


gacgtcgacc 


900 


cactctagag 


gatcgatccc 


cgctccgtgt 


aaatggaggc 


gctcgttgat 


ctgagccttg 


960 


ccccctgacg 


aacggcggtg 


gatggaagat 


actgctctca 


agtgctgaag 


cggtagctta


1020 


gctccccgtt 


tcgtgctgat 


cagtcttttt 


caacacgtaa 


aaagcggagg 


agttttgcaa 


1080 


ttttgttggt 


tgtaacgatc 


ctccgttgat 


tttggcctct 


ttctccatgg 


gcgggctggg 


1140 


cgtatttgaa 


gcttaattaa 


ctcgaggggg 


ggcccggtac 


c 




1181 


<210> 169 
<211> 290 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> synthetic sequence 










<400> 169 
gcagttgggt 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


60 


gcctcgcggt 


cgctgtcggc 


gccaaacccg 


cagctgcatc 


caccagattc 


acttgttaga 


120 


tcgacctagg 


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


180 


ggcatggatc 


gccatctcga 


ttcgcgcggc 


agaatcgggc 


cccgcgcaca 


tttaagccgc 


240 


gggcgagact 


catttcgtta 


aactggctta 


aatcgttaac 


aatcgtgtga 




290 


<210> 170 
<211> 566 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> synthetic sequence 










<400> 170 
gatttaacat 


aactgtcgat 


taccgtgcga 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


300 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


360 


cgctgtcggc 


gccaaacccg 


cagctgcatc 


caccagattc 


acttgttaga 


tcgacctagg 


420 


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


480 
gccatctcga ttcgcgcggc agaatcgggc cccgcgcaca tttaagccgc gggcgatatg 540

cttgacaatc gtaatcctgg tgacaa 566 

<210> 171 

<211> 290 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 

<400> 171 



taacaagaat 


ctggctaatc 


aatcgatgca 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 




290 


<210> 172 
<211> 381 
<212> DNA 

<213> Chlamydomonas reinhardtii










<400> 172 
atggccatgg 


ctatgcgctc 


caccttcgcc 


gcccgcgttg 


gcgctaagcc 


cgctgtccgc 


60 


ggtgctcgcc 


ccgccagccg 


catgagctgc 


atggcctaca 


aggtcaccct 


gaagacccct 


120 


tcgggcgaca 


agaccattga 


gtgccccgct 


gacacctaca 


tcctggacgc 


tgctgaggag 


180 


gccggcctgg 


acctgcccta 


ctcttgccgc 


gctggtgctt 


gctccagctg 


cgccggcaag 


240 


gtcgctgccg 


gcaccgtcga 


ccagtcggac 


cagtccttcc 


tggacgatgc 


ccagatgggc 


300 


aacggcttcg 


tgctgacctg 


cgtggcctac 


cccacctcgg 


actgcaccat 


ccagacccac 


360 


caggaggagg 


ccctgtacta 


a 








381 


<210> 173 
<211> 1494 
<212> DNA 

<213> Chlamydomonas reinhardtii










<400> 173 
atgtcggcgc 


tcgtgctgaa 


gccctgcgcg 


gccgtgtcta 


ttcgcggcag 


ctcctgcagg 


60 


gcgcggcagg 


tcgccccccg 


cgctccgctc 


yi_.ciyi_.i~ciyi_ci 


t-i_y l y i_y i_y l 


dyLLL L Ly v_cl 




acacttgagg 


cgcccgcacg 


ccgcctaggc 


aacgtcgctt 


gcgcggctgc 


cgcacccgct 


180 


gcggaggcgc 


ctttgagtca 


tgtccagcag 


gcgctcgccg 


agcttgccaa 


gcccaaggac 


240 


gaccccacgc 


gcaagcacgt 


ctgcgtgcag 


gtggctccgg 


ccgttcgtgt 


cgctattgcc 


300 


gagaccctgg 


gcctggcgcc 


gggcgccacc 


acccccaagc 


agctggccga 


gggcctccgc 


360 


cgcctcggct 


ttgacgaggt 


gtttgacacg 


ctgtttggcg 


ccgacctgac 


catcatggag 


420 


gagggcagcg 


agctgctgca 


ccgcctcacc 


gagcacctgg 


aggcccaccc 


gcactccgac 


480 


gagccgctgc 


ccatgttcac 


cagctgctgc 


cccggctgga 


tcgctatgct 


ggagaaatct 


540 


tacccggacc 


tgatccccta 


cgtgagcagc 


tgcaagagcc 


cccagatgat 


gctggcggcc 


600 


atggtcaagt 


cctacctagc 


ggaaaagaag 


ggcatcgcgc 


caaaggacat 


ggtcatggtg 


660 


tccatcatgc 


cctgcacgcg 


caagcagtcg 


gaggctgacc 


gcgactggtt 


ctgtgtggac 


720 
gccgacccca


ccctgcgcca 


gctggaccac 


gtcatcacca 


ccgtggagct 


gggcaacatc 


780 


ttcaaggagc 


gcggcatcaa 


cctggccgag 


ctgcccgagg 


gcgagtggga 


caatccaatg 


840 


ggcgtgggct 


cgggcgccgg 


cgtgctgttc 


ggcaccaccg 


gcggtgtcat 


ggaggcggcg 


900 


ctgcgcacgg 


cctatgagct 


gttcacgggc 


acgccgctgc 


cgcgcctgag 


cctgagcgag 


960 


gtgcgcggca 


tggacggcat 


caaggagacc 


aacatcacca 


tggtgcccgc 


gcccgggtcc 


1020 


aagtttgagg 


agctgctgaa 


gcaccgcgcc 


gccgcgcgcg 


ccgaggccgc 


cgcgcacggc 


1080 


acccccgggc 


cgctggcctg 


ggacggcggc 


gcgggcttca 


ccagcgagga 


cggcaggggc 


1140 


ggcatcacac 


tgcgcgtggc 


cgtggccaac 


gggctgggca 


acgccaagaa 


gctgatcacc 


1200 


aagatgcagg 


ccggcgaggc 


caagtacgac 


tttgtggaga 


tcatggcctg 


ccccgcgggc 


1260 


tgtgtgggcg 


gcggcggcca 


gccccgctcc 


accgacaagg 


ccatcacgca 


gaagcggcag 


1320 


gcggcgctgt 


acaacctgga 


cgagaagtcc 


acgctgcgcc 


gcagccacga 


gaacccgtcc 


1380 


atccgcgagc 


tgtacgacac 


gtacctcgga 


gagccgctgg 


gccacaaggc 


gcacgagctg 


1440 


ctgcacaccc 


actacgtggc 


cggcggcgtg 


gaggagaagg 


acgagaagaa 


gtga 


1494 


<210> 174 
<211> 1725 
<212> DNA 

<213> Clostridium pasteurianum










<400> 174 
atgaaaacaa 


taattataaa 


tggtgtacag 


tttaatactg 


atgaagacac 


tactatatta 


60 


aaatttgcac 


gagacaacaa 


tattgatata 


tctgcactgt 


gttttttaaa 


taattgtaat 


120 


aatgacataa 


ataagtgtga 


aatatgtact 


gtagaggtag 


agggtactgg 


attagtaaca 


180 


gcctgtgata 


cattaattga 


ggatggtatg 


attataaaca 


caaattccga 


tgctgtcaac 


240 


gaaaaaatta 


aatctagaat 


atctcaatta 


ttagacatac 


atgaattcaa 


atgtggtcct 


300 


tgcaatagaa 


gagaaaactg 


tgaattctta 


aaacttgtta 


taaaatataa 


agcaagagct 


360 


tctaaaccat 


ttttacctaa 


agataagact 


gaatatgtag 


atgaaagaag 


taaatcatta 


420 


actgtagata 


ggacaaaatg 


cttattatgt 


ggaagatgtg 


ttaatgcctg 


tggaaaaaat 


480 


actgaaacct 


atgcaatgaa 


atttttaaac 


aaaaatggta 


aaactataat 


tggagcagag 


540 


gatgaaaaat 


gctttgatga 


tactaattgt 


ctattatgtg 


gtcaatgtat 


aatcgcctgt 


600 


ccagtagcag 


cattatcgga 


aaaatcacac 


atggatagag 


taaaaaatgc 


cttaaatgcc 


660 


cctgaaaaac 


atgtaatagt 


agctatggct 


ccatctgtca 


gagcttctat 


aggtgaactt 


720 


tttaatatgg 


gatttggcgt 


tgacgtaaca 


ggaaaaattt 


atactgcttt 


aagacagctt 


780 


ggatttgata 


aaatattcga 


tataaacttc 


ggagcagata 


tgacaattat 


ggaagaggct 


840 


acagaattag 


ttcaaagaat 


agagaataat 


ggacctttcc 


caatgtttac 


atcttgctgc 


900 


ccaggttggg 


taagacaagc 


tgaaaattat 


tatcctgaat 


tactaaataa 


tctttcatca 


960 


gctaaatcac 


ctcaacaaat 


ttttggtact 


gctagtaaaa 


cttattatcc 


ttctatatct 


1020 


ggtcttgacc 


caaagaatgt 


atttactgta 


acagttatgc 


cctgtacttc 


aaaaaaattt 


1080 


gaagcagata 


gaccacaaat 


ggaaaaagac 


ggcctaagag 


atatagatgc 


tgttataact 


1140 


actcgagaat 


tagcaaaaat 


gattaaagat 


gctaaaatac 


catttgctaa 


acttgaagat 


1200 


agcgaagcag 


accctgctat 


gggagaatac 


agcggtgctg gtgccatatt 
ggcggagtta tggaagcagc


tttaagaagt 


gcaaaagact 


ttgctgaaaa 


cgctgaactt


1320 


gaagatatag 


aatataagca 


agttagagga 


ttaaatggta 


taaaagaagc 


tgaagtagaa 


1380 


ataaataaca 


acaaatataa 


tgtagctgtt 


ataaatggtg 


cttcaaattt 


atttaagttt 


1440 


atria aatrtn 


y LaLVjau laa 


cgaaaaacaa 


tatcatttca 


tagaagtaat 


ggcttgtcat 


1500 


ggaggatgtg 


taaatggtgg 


tggacagcct


catgtaaacc 


caaaagattt 


agaaaaagta 


1560 


gacataaaaa 


aagtaagagc 


ttctgtattg 


tataatcagg 


atgaacatct 


ttccaagaga 


1620 


aaatctcatg 


aaaatactgc 


attagttaaa 


atgtatcaaa 


attattttgg 


caaaccaggt 


1680 


gaaggtcgtg 


cccatgaaat 


attacacttt 


aaatataaaa 


aataa 




1725 


<210> 175 
<211> 1265 
<212> DNA 

<213> Desulfovibrio vulgaris 










<400> 175 
angagccg la 


ccgtcatgga 


gcgcatcgaa 


tatgagatgc 


acactccgga 


ccccaaggcc 


60 


gaxccggaca 


agctccactt


cgtccagatc 


gacgaggcaa 


agtgcatagg 


ctgcgacacc 


120 


ng l Lcgcagu 


actgccccac 


cgccgccatc 


ttcggcgaaa 


tgggcgaacc


gcactccatt 


180 


ccccacarcg 


aggcgtgcat


caactgcggc 


cagtgcctca 


cgcactgccc 


cgagaacgcc 


240 


atctacgagg


cacagtcgtg


gtgcctgaag 


tcgagaagaa 


gctgaaggac 


ggcaaggtga 


300 


aatgcatcgc 


catgcccgcc 


cccgccgtgc 


gctatgcact


gggcgacgcc 


ttcggcatgc 


360 


ccgtcggttc 


cgtcaccacc 


ggcaagatgc 


tcgcggccct 


gcagaagctc 


ggcttcgctc 


420 


attgctggga 


caccgagttc 


accgctgacg 


tgaccatctg 


ggaagagggg 


tccgagttcg 


480 


tggaacgcct 


caccaagaag 


agcgacatgc


cgctgccgca 


gttcacctcg 


tgctgccccg 


540 


gctggcagaa 


gtatgccgag


acctactacc 


ccgaactgct 


gccgcacttc 


tccacgtgca 


600 


agtcgcccat 


cggcatgaac


ggcgcactgg 


cgaagaccta 


cggcgcagag


cggatgaagt


660 


acgaccccaa 


gcaggtctac 


accgtctcca 


tcatgccctg 


catcgcaaag


aagtacgaag 


720 


ggttgcgtcc 


cgaactgaag 


tccagcggca 


tgcgcgacat


cgacgccacg 


ctgaccaccc 


780 


gtgagctggc 


ctacatgatc 


aagaaggccg 


gtatcgactt 


cgcgaaactc 


cccgacggca 


840 


agcgtgacag 


cctcatgggt 


gaatccaccg 


gcggtgccac 


catcttcggc 


gtcaccggcg 


900 


gcgtcatgga 


agcggcactc 


cgcttcgcct 


acgaagccgt


caccggcaag 


aagcccgaca 


960 


gctgggactt 


caaggccgtg


cgcggtcttg 


atggcatcaa 


ggaagccacc 


gtcaacgtcg


1020 


gcggtaccga 


cgtcaaggtc 


gccgtggtgc 


acggggccaa


gcggttcaag 


caggtctgcg


1080 


acgatgtgaa 


ggcgggcaag 


tcgccctatc 


acttcatcga 


atacatggcc


tgccccggcg 


1140 


gctgcgtctg 


tggcggcggt 


cagcccgtca 


tgcccggcgt 


gctcgaagcc 


atggaccgca 


1200 


ccaccacccg 


cctttacgcg 


ggcctgaaga


agcgcctcgc 


catggcgagc


gccaacaagg


1260 


catag 












1265 


<210> 176 
<211> 1407 
<212> DNA 

<213> Entamoeba histolytica 










<400> 176 
ggacatgacc ataaccatag


tancaaitt 


60 


gattggtcta 


aatgcatggg 


ttgtggaatg 


tgtgctacta 


aatgtacttt 


tggggtgtta 


120 


gtaaaacaac 


caccaaaaat 


tccaccattt 


gttcagccta 


atagagaaaa 


actctctcaa 


180 


gaaaataccg 


acaagacaag 


agtacttatt 


gatgagtctg 


aatgtactgg 


gtgtggtcaa 


240 


tgttctttgg 


tttgtaactt 


tggttctatt 


acaccaatag 


accatcttgt 


tgatactttt 


300 


aaagctaaag 


aagctggaaa 


gaagcttgtt 


gctatgattg 


caccttcaac 


tcgtttaggt 


360 


gttgctgagg 


ctatgggaat 


gcctattgga 


agtacagcta 


tggctcagtt 


agttcattgt 


420 


ttaagactta 


ttggatttga 


ttatgtattt 


gatgttgatg 


ctggagctga 


taagacaaca 


480 


atggatgatt 


atgccgaagt 


tattgaaatg 


aaaaaagaag 


gaaaaggacc 


tgctattact 


540 


tcctgttgtc 


ctgcttggat 


tgaacttgtt 


gaaaaagaat 


atcctgactt 


aattccaaac 


600 


gtctctactg 


cccgttcacc 


aattggatgt 


ttagctggtt 


gtattaaaag 


aggatgggca 


660


aaggatgtag 


gaattgcagt 


agaagatctt 


tacactgttg


gaataatgcc 


ttgtattgct 


720 


aaaaaaacag 


agtctcaaag 


acaacaaatt 


catcaagact 


atgatgcttc 


atgtacttca 


780 


aatgaaattg 


ctgcttattt 


caaaaaacat 


cttccacctg 


aagaatgtaa 


atttacacaa 


840 


gaaagagaag 


aagcacttgc 


taaaactgaa 


gatggtcaat 


gtgatttacc 


atttagacgt 


900 


atttctggtg 


gttctaatat 


ttttggaaag 


actggaggag 


tttgtgaaac 


tgtattgaga 


960 


gtaattgcac 


gtaatgcagg 


agttgattgg 


aacagttgta 


ctgttaacaa 


ggaagaaact 


1020 


tttaaacatg 


ctgcaagtgg 


atcaacaatg 


acaaatcttt 


ctgttgatat 


tggtggaact 


1080 


attatcacag 


gtgctgtttg 


tcatggtggt 


tatgctatta


gacatgcttg 


tgaacttatt 


1140 


agaaaaggag 


agttaaaagt 


tgatgttgtt 


gaaatgatgg 


catgtgttgg 


aggttgtctt 


1200 


ggaggagcag 


gtcaaccaaa 


aattccacca 


gcaaagaaac 


ttgagatgga 


taagagaaga 


1260 


gtaatgttag 


atattttaga 


tcaacaaact 


gatattagag 


ctgctaatga 


aaatactgat 


1320 


gttcttggat 


ggattgataa 


acattttgat 


catcaaggtg 


cacatcagca 


tcttcacaca 


1380 


tattttactc 


ccagatatca 


aaactaa 








1407 


<210> 177 
<211> 1350 
<212> DNA 

<213> Scenedesmus obliquus 










<400> 177 
atgcctgagt 


ggcaaccggg 


aggtcggtat 


gctgtttctg 


tccgcccgcc 


agtgaacagg 


60 


cgggctgtgg 


tggcagcaga 


gcgcaggcgc 


cttgttgtgc 


gggcagctgg 


cccaacagca 


120 


gaatgtgatt 


gcccaccagc 


tcccgcgccc 


aaggccccgc 


actggcagca 


gacgctagat 


180 


gagctagcca 


agcctaagga gcagcgcaag 


gtgatgatcg 


cccagatcgc 


accagcagtg 


240 


cgcgtggcta 


ttgcagagac 


catgggactc 


aaccctgggg 


atgtgacagt 


tggccagatg 


300 


gtgaccggcc 


tgcgcatgct 


gggctttgat 


tatgtgtttg 


acacgctgtt 


tggtgctgac 


360 


ctcaccatca 


tggaggaggg 


cacagagcta 


cggcacaggc 


ttcaggacca 


cctggagcag 


420 


caccccaaca 


aggaggagcc 


gctgcccatg 


ttcaccagct 


gctgccctgg 


ctgggtggcc 


480 


atggtggaga 


agtccaaccc 


cgagctcatc 


ccctacctgt 


cttcctgcaa 


gtcgccccag 


540 


atgatgctgg 


gcgcagtcat 


caagaactac 


ttcgctgccg 


aggccggcgc 


caagcctgag 


600 
gtgcgcaagc agggcgaggc


tgaccgcgag 


660 


tggttcaaca 


ccacaggggc 


tggcggcgcg 


aacgtggacc 


acgtcatgac 


aactgcagag 


720 


ctgggcaaga 


tctttgtgga 


gcgcggaatc 


aagctgaacg 


acctgcagga 


gtcgcccttt 


780 


gacaaccccg 


tcggcgaggg 


cagcggcggc 


ggcgtgctgt 


tcggcaccac 


tggaggcgtg 


840 


atggaggcgg 


cgctgcgcac 


cgtgtacgaa 


gtggtcacac 


agaagccttt 


ggaccgcatc 


900 


gtctttgagg 


acgtgcgcgg 


cctggagggc 


atcaaggagt 


ccacgctgca 


cctcacccca 


960 


ggccccacca 


gccccttcaa 


ggcctttgca 


ggcgcagacg 


gcaccggcat 


caccctcaac 


1020 


atcgcggtcg 


ccaacggcct 


cggcaatgcc 


aagaagctca 


tcaagcagct 


ggctgcaggc 


1080 


gagagcaagt 


acgacttcat 


cgaggtcatg 


gcctgccccg 


gcggctgcat 


cggcggcggc 


1140 


ggccagccgc 


gcagcgcgga 


caagcagatc 


ctgcagaagc 


gccaggcggc 


catgtacgac 


1200 


ctggacgagc 


gcgcggtgat 


ccggcgcagc 


cacgagaacc 


cgctgattgg 


cgcgctgtat 


1260 


gagaagttcc 


tgggcgagcc 


caacggccac 


aaggcgcacg 


agctgctgca 


cacgcactac 


1320 


gtggccggcg 


gcgtgcccga tgagaagtga 








1350 


<210> 178 
<211> 1311 
<212> DNA 

<213> Chlorella fusca

<400> 178

atgtgttgcc 


ccgtggttgc 


aagtaggcac 


gcagggcgtg 


caaggcatgt 


tgctgtccgt 


60 


gcagcagggc 


caacatctga 


gtgtgattgt 


cctccaacac 


ctcaggccaa 


gctgcctcac 


120 


tggcagcagg 


ctctggatga 


gctcgccaag 


cccaaggaga 


gcaggaggtt 


gatgatcgcg 


180 


caaatcgcct 


ccgctgttcg 


tgtcgctatt 


gctgagacca 


ttggcttggc 


cccaggagat 


240 


gtcaccattg 


ggcagctcgt 


gactgggctg 


cgtatgcttg 


gctttgatta 


tgtctttgac 


300 


accctgtttg 


gtgctgacct 


gaccattatg


gaggagggaa 


cggagctgct 


gcatcgcctg 


360 


caggaccatc 


tggagcagca 


ccccaacaag 


gaggagccac 


tgcccatgtt 


caccagttgc 


420 


tgcccaggct 


gggttgccat 


ggttgaaaag 


agcaatcctg 


agctcatccc 


ctacctgtca 


480 


tcttgcaagt 


cgcctcagat 


gatgcttggg 


gccgttatca 


agaactacta 


tgcacagcag 


540 


gttggagtgc 


agcccagtga 


catctgcaac 


gtgtcagtca 


tgccatgcgt 


acgcaagcag 


600 


ggagaggctg 


accgggagtg 


gttcaacacc 


acaggtgcag 


gccttgcccg 


tgatgttgat 


660 


catgtggtga 


ctactgctga 


ggttggtaag 


atattcctgg 


agcgtggcat 


caagctgaat 


720 


gagctgccag 


agagcaactt 


tgacaacccc 


attggcgagg 


gcacaggtgg 


tgctctgctg 


780 


tttggcacca 


ctggaggtgt 


catggaggca 


gcacttcgca 


cagtctatga 


agtggtgacc 


840 


cagaagccca 


tgggtcgtgt 


tgactttgag 


gaggtgcgag 


gccttgaagg 


aatcaaggag 


900 


gcagagatca 


cactcaagcc 


aggagacgac 


agcccattca 


aagccttcgc 


aggagctgat 


960 


gggcagggca 


tcacgctcaa 


gattgcagta 


gccaatgggc 


ttggcaatgc 


caagaagctc 


1020 


atcaagagcc 


tgtcagaggg 


caaggccaag 


tatgatttca 


ttgaggtcat 


ggcatgccct 


1080 


ggtggctgca 


ttggcggagg 


cggtcagccc 


cgcagtactg 


acaagcagat 


cctgcagaag 


1140 


cgccagcagg 


ctatgtacaa 


cctggatgag 


cgcagtacca 


tccgccgcag 


ccatgataac 


1200 


ccattcatcc 


aggcgctgta 


tgacaagttc 


ctaggcgcac 


ccaacagcca 


caaggcacat 


1260 
gat e tg atfg© | ia i Gacacasc*a' ai, t g* gg c ag g t


ggaattccag 


aggagaagtg 


a 


1311 


<210> 179 
<211> 717 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Green Fluorescent Protein 










<400> 179 


atggccaagg


gcgaggagct


gttcaccggt


gtggtcccca 


tcctggtgga 


gctggacggc 


60 


gacgtgaacg


gccacaagtt


ctccgtctcc




ggcgagggtg 


agggtgacgc 


cacctacggc 


120 


aagctgaccc


tgaagttcat


ctgcaccacc


ggcaagctgc 


ccgtgccctg 


gcccaccctg 


180 


gtcaccaccc 


tgacctacgg 


tgtgcagtgc 


ttctcccgct 


accccgacca 


catgaagcag 


240 


cacgacttct 


tcaagtccgc 


catgcccgag 


ggctacgtgc 


aggagcgcac 


catcttcttc 


300 


aaggacgacg 


gcaactacaa 


gacccgcgcc 


gaggtcaagt 


tcgagggcga 


caccctggtg 


360 


aaccgcatcg 


agctgaaggg 


catcgacttc 


aaggaggacg 


gcaacatcct 


gggccacaag 


420 


ctggagtaca 


actacaactc 


ccacaacgtg 


tacatcatgg 


ccgacaagca 


gaagaacggc 


480 


atcaaggtga 


acttcaagat 


ccgccacaac 


atcgaggacg 


gctccgtgca 


gctggccgac 


540 


cactaccagc 


agaacacccc 


catcggcgat 


ggccccgtgc 


tgctgcccga 


caaccactac 


600 


ctgtccatcc 


agtccgccct 


gtccaaggac 


cccaacgaga 


agcgcgacca 


catggtcctg 


660 


ctggagttcg 


tcaccgctgc 


cggcatcacc 


cacggcatgg 


acgagctgta 


caagtaa 


717 


<210> 180 
<211> 320 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 180 
atccgtagtt 


atccttatgg 


ccatcttagc 


gcagttgggt 


caggggctgg 


cgacgcgctg 


60 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


cgctgtcggc 


gccaaacccg 


120 


cagctgcatc 


caccagattc 


acttgttaga 


tcgacctagg 


ttgcgggacc 


ggaggcggct 


180 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


gccatctcga 


ttcgcgcggc 


240 


agaatcgggc 


cccgcgcaca 


tttaagccgc 


gggcgagact 


catttcgtta 


cgtgcatcga 


300 


ttaacagctt 


ctggacctga 










320 


<210> 181 
<211> 580 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 181 
ttaaacgtcg 


tacgtccaag 


tataactaag 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc


aattttgttg


gttgtaacga


240


tcctccgttg


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


300 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


360 


cgctgtcggc gccaaacccg 


cagctgcatc


caccagattc


acttgttaga


tcgacctagg


420


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


480 


gccatctcga ttcgcgcggc 


agaatcgggc 


cccgcgcaca 


tttaagccgc 


gggcgagact 


540 


catttcgtta 


aatctgatac 


atgctattca 


gatcttacaa 






580 


<210> 182 
<211> 580 
<212> DNA 

<213> Artificial sequence 










<220> 

<223> Synthetic sequence 










<400> 182 
tcttccatcg 


taaatctagc 


atcgattagc 


ccgacgtcga 


cccactctag 


aggatcgatc 


60 


cccgctccgt 


gtaaatggag 


gcgctcgttg 


atctgagcct 


tgccccctga 


cgaacggcgg 


120 


tggatggaag 


atactgctct 


caagtgctga 


agcggtagct 


tagctccccg 


tttcgtgctg 


180 


atcagtcttt 


ttcaacacgt 


aaaaagcgga 


ggagttttgc 


aattttgttg 


gttgtaacga 


240 


tcctccgttg 


attttggcct 


ctttctccat 


gggcgggctg 


ggcgtatttg 


gcagttgggt 


300 


caggggctgg 


cgacgcgctg 


ctgacgcgca 


agtgaatggc 


ccaacaagtc 


gcctcgcggt 


360 


cgctgtcggc 


gccaaacccg 


cagctgcatc


caccagattc


acttgttaga


tcgacctagg


420


ttgcgggacc 


ggaggcggct 


cgctgtgcaa 


gcgcggtgac 


ctcgtacggc 


ggcatggatc 


480 


gccatctcga 


ttcgcgcggc 


agaatcgggc 


cccgcgcaca 


tttaagccgc 


gggcgagact 


540 


catttcgtta 


atctgtaata 


atctagtcga 


ggcattcaag 






580 


<210> 183 
<211> 777 
<212> DNA 
<213> Artificial sequence










<220> 

<223> Synthetic sequence 










<400> 183 
atctgtaata 


atctagtcga


ggcattcaag 


atggccaagg 


gcgaggagct 


gttcaccggt 


60 


gtggtcccca 


tcctggtgga 


gctggacggc 


gacgtgaacg 


gccacaagtt 


ctccgtctcc 


120 


ggcgagggtg 


agggtgacgc 


cacctacggc 


aagctgaccc 


tgaagttcat 


ctgcaccacc 


180 


ggcaagctgc 


ccgtgccctg


gcccaccctg 


gtcaccaccc 


tgacctacgg 


tgtgcagtgc 


240 


ttctcccgct 


accccgacca 


catgaagcag 


cacgacttct 


tcaagtccgc 


catgcccgag 


300 


ggctacgtgc 


aggagcgcac 


catcttcttc 


aaggacgacg 


gcaactacaa 


gacccgcgcc 


360 


gaggtcaagt 


tcgagggcga 


caccctggtg 


aaccgcatcg 


agctgaaggg 


catcgacttc 


420 


aaggaggacg 


gcaacatcct 


gggccacaag 


ctggagtaca 


actacaactc 


ccacaacgtg 


480 


tacatcatgg 


ccgacaagca 


gaagaacggc 


atcaaggtga 


acttcaagat 


ccgccacaac 


540 


atcgaggacg 


gctccgtgca 


gctggccgac 


cactaccagc 


agaacacccc 


catcggcgat 


600 


ggccccgtgc 


tgctgcccga 


caaccactac 


ctgtccatcc 


agtccgccct 


gtccaaggac 


660 
cccaacgaga agcgcgacca catggtcctg ctggagttcg tcaccgctgc cggcatcacc 720

cacggcatgg acgagctgta caagtaaaac tggcttaaat cgttaacaat cgtgtga 777 



<210> 184 
<211> 320 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Synthetic sequence 
<400> 184 

aactggctta aatcgttaac aatcgtgtga ccgacgtcga cccactctag aggatcgatc 60 
cccgctccgt gtaaatggag gcgctcgttg atctgagcct tgccccctga cgaacggcgg 120 
tggatggaag atactgctct caagtgctga agcggtagct tagctccccg tttcgtgctg 180 
atcagtcttt ttcaacacgt aaaaagcgga ggagttttgc aattttgttg gttgtaacga 240 
tcctccgttg attttggcct ctttctccat gggcgggctg ggcgtatttg gatttaacat 300 
aactgtcgat taccgtgcga 320 